* [PATCH 1/6] nfsd: size fh_verify server sockaddr slot by xpt_locallen
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
@ 2026-05-31 12:06 ` Jeff Layton
2026-05-31 12:06 ` [PATCH 2/6] nfsd: release path refs on follow_down() error Jeff Layton
` (5 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 12:06 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Al Viro, Rick Macklem, Chris Mason
Cc: linux-nfs, linux-kernel, Jeff Layton
From: Chris Mason <clm@meta.com>
The nfsd_fh_verify and nfsd_fh_verify_err tracepoints declare the
server sockaddr slot sized by xpt_remotelen but fill it from
xpt_local using xpt_locallen:
TP_STRUCT__entry(
...
__sockaddr(server, rqstp->rq_xprt->xpt_remotelen)
...
)
TP_fast_assign(
...
__assign_sockaddr(server, &rqstp->rq_xprt->xpt_local,
rqstp->rq_xprt->xpt_locallen);
...
)
When xpt_locallen exceeds xpt_remotelen, __assign_sockaddr's memcpy
writes past the reserved ring-buffer slot. In the reverse direction
(xpt_locallen < xpt_remotelen) the slot is oversized and the
unwritten tail leaks prior ring-buffer contents to trace consumers.
The write-past-end case is reachable on NFS/UDP. svc_xprt_set_remote()
is only called from svc_tcp_accept() (net/sunrpc/svcsock.c) and from
the RDMA connect path; svc_create_socket() for UDP calls only
svc_xprt_set_local(), so xpt_remotelen stays 0 for the xprt's
lifetime. Every fh_verify trace for an NFSv2/v3-over-UDP request
then copies 16 or 28 bytes from xpt_local into a zero-byte slot.
The other NFSD tracepoints that record the server address
(NFSD_TRACE_PROC_CALL_FIELDS, NFSD_TRACE_PROC_RES_FIELDS,
SVC_RQST_ENDPOINT_FIELDS) already size the server slot by
xpt_locallen; nfsd_fh_verify and nfsd_fh_verify_err were the only
exceptions.
Fix by sizing the server slot with xpt_locallen so the declared slot
matches the copy length. The client slot and its assignment already
agree on xpt_remotelen and are left untouched.
Fixes: 051382885552 ("NFSD: Instrument fh_verify()")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Chris Mason <clm@meta.com>
---
fs/nfsd/trace.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 9917c0440522..db0a0dc70660 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -272,7 +272,7 @@ TRACE_EVENT_CONDITION(nfsd_fh_verify,
TP_CONDITION(rqstp != NULL),
TP_STRUCT__entry(
__field(unsigned int, netns_ino)
- __sockaddr(server, rqstp->rq_xprt->xpt_remotelen)
+ __sockaddr(server, rqstp->rq_xprt->xpt_locallen)
__sockaddr(client, rqstp->rq_xprt->xpt_remotelen)
__field(u32, xid)
__field(u32, fh_hash)
@@ -311,7 +311,7 @@ TRACE_EVENT_CONDITION(nfsd_fh_verify_err,
TP_CONDITION(rqstp != NULL && error),
TP_STRUCT__entry(
__field(unsigned int, netns_ino)
- __sockaddr(server, rqstp->rq_xprt->xpt_remotelen)
+ __sockaddr(server, rqstp->rq_xprt->xpt_locallen)
__sockaddr(client, rqstp->rq_xprt->xpt_remotelen)
__field(u32, xid)
__field(u32, fh_hash)
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 2/6] nfsd: release path refs on follow_down() error
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
2026-05-31 12:06 ` [PATCH 1/6] nfsd: size fh_verify server sockaddr slot by xpt_locallen Jeff Layton
@ 2026-05-31 12:06 ` Jeff Layton
2026-06-01 18:47 ` Al Viro
2026-05-31 12:07 ` [PATCH 3/6] nfsd: fix nfsd_file leak on inter-server COPY setup failure Jeff Layton
` (4 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 12:06 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Al Viro, Rick Macklem, Chris Mason
Cc: linux-nfs, linux-kernel, Jeff Layton
From: Chris Mason <clm@meta.com>
nfsd_cross_mnt() initializes a local struct path with mntget() and
dget() before calling follow_down(). On a negative return the error
arm jumps to out without releasing those references:
err = follow_down(&path, follow_flags);
if (err < 0)
goto out;
follow_down() never drops the caller's entry-time refs on any error
sub-case; for example a pre-cross d_manage() failure leaves path
untouched, so the mntget()/dget() taken on entry survive the call.
Every other early-exit arm in nfsd_cross_mnt() (other-namespace
return, IS_ERR(exp2), and the success tail after the swap) already
calls path_put(&path); the err < 0 arm is the lone omission. The
leak inflates mnt_count and d_count on each failed cross-mount,
blocking umount and pinning dentries against the shrinker, and is
reachable by any authenticated NFS client through nfsd_lookup_dentry
or the NFSv4 READDIR encode path.
Fix by calling path_put(&path) before the goto out in the err < 0
arm so the entry-time refs are released on all follow_down() error
returns.
Fixes: cc53ce53c869 ("Add a dentry op to allow processes to be held during pathwalk transit")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Chris Mason <clm@meta.com>
---
fs/nfsd/vfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 62b56d73432a..95ce15440492 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -137,8 +137,10 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
follow_flags = LOOKUP_AUTOMOUNT;
err = follow_down(&path, follow_flags);
- if (err < 0)
+ if (err < 0) {
+ path_put(&path);
goto out;
+ }
if (path.mnt == exp->ex_path.mnt && path.dentry == dentry &&
nfsd_mountpoint(dentry, exp) == 2) {
/* This is only a mountpoint in some other namespace */
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 2/6] nfsd: release path refs on follow_down() error
2026-05-31 12:06 ` [PATCH 2/6] nfsd: release path refs on follow_down() error Jeff Layton
@ 2026-06-01 18:47 ` Al Viro
2026-06-01 18:57 ` Jeff Layton
0 siblings, 1 reply; 15+ messages in thread
From: Al Viro @ 2026-06-01 18:47 UTC (permalink / raw)
To: Jeff Layton
Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Rick Macklem, Chris Mason, linux-nfs, linux-kernel
On Sun, May 31, 2026 at 08:06:59AM -0400, Jeff Layton wrote:
> Fix by calling path_put(&path) before the goto out in the err < 0
> arm so the entry-time refs are released on all follow_down() error
> returns.
Might be better to unify all those path_put()... Something like this,
perhaps?
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index eafdf7b7890f..bdcd78f49112 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -141,13 +141,13 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
if (path.mnt == exp->ex_path.mnt && path.dentry == dentry &&
nfsd_mountpoint(dentry, exp) == 2) {
/* This is only a mountpoint in some other namespace */
- path_put(&path);
goto out;
}
exp2 = rqst_exp_get_by_name(rqstp, &path);
if (IS_ERR(exp2)) {
err = PTR_ERR(exp2);
+ exp2 = NULL;
/*
* We normally allow NFS clients to continue
* "underneath" a mountpoint that is not exported.
@@ -157,10 +157,7 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
*/
if (err == -ENOENT && !(exp->ex_flags & NFSEXP_V4ROOT))
err = 0;
- path_put(&path);
- goto out;
- }
- if (nfsd_v4client(rqstp) ||
+ } else if (nfsd_v4client(rqstp) ||
(exp->ex_flags & NFSEXP_CROSSMOUNT) || EX_NOHIDE(exp2)) {
/* successfully crossed mount point */
/*
@@ -174,9 +171,10 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
*expp = exp2;
exp2 = exp;
}
- path_put(&path);
- exp_put(exp2);
out:
+ path_put(&path);
+ if (exp2)
+ exp_put(exp2);
return err;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 2/6] nfsd: release path refs on follow_down() error
2026-06-01 18:47 ` Al Viro
@ 2026-06-01 18:57 ` Jeff Layton
2026-06-01 19:31 ` Al Viro
2026-06-01 19:43 ` Chuck Lever
0 siblings, 2 replies; 15+ messages in thread
From: Jeff Layton @ 2026-06-01 18:57 UTC (permalink / raw)
To: Al Viro
Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Rick Macklem, Chris Mason, linux-nfs, linux-kernel
On Mon, 2026-06-01 at 19:47 +0100, Al Viro wrote:
> On Sun, May 31, 2026 at 08:06:59AM -0400, Jeff Layton wrote:
>
>
> > Fix by calling path_put(&path) before the goto out in the err < 0
> > arm so the entry-time refs are released on all follow_down() error
> > returns.
>
> Might be better to unify all those path_put()... Something like this,
> perhaps?
>
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index eafdf7b7890f..bdcd78f49112 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -141,13 +141,13 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
> if (path.mnt == exp->ex_path.mnt && path.dentry == dentry &&
> nfsd_mountpoint(dentry, exp) == 2) {
> /* This is only a mountpoint in some other namespace */
> - path_put(&path);
> goto out;
> }
>
> exp2 = rqst_exp_get_by_name(rqstp, &path);
> if (IS_ERR(exp2)) {
> err = PTR_ERR(exp2);
> + exp2 = NULL;
> /*
> * We normally allow NFS clients to continue
> * "underneath" a mountpoint that is not exported.
> @@ -157,10 +157,7 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
> */
> if (err == -ENOENT && !(exp->ex_flags & NFSEXP_V4ROOT))
> err = 0;
> - path_put(&path);
> - goto out;
> - }
> - if (nfsd_v4client(rqstp) ||
> + } else if (nfsd_v4client(rqstp) ||
> (exp->ex_flags & NFSEXP_CROSSMOUNT) || EX_NOHIDE(exp2)) {
> /* successfully crossed mount point */
> /*
> @@ -174,9 +171,10 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
> *expp = exp2;
> exp2 = exp;
> }
> - path_put(&path);
> - exp_put(exp2);
> out:
> + path_put(&path);
> + if (exp2)
> + exp_put(exp2);
> return err;
> }
>
Looks reasonable. Chuck has already taken this patch into his tree, but
we could do a cleanup on top. Want to send us an "official" patch?
Thanks,
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH 2/6] nfsd: release path refs on follow_down() error
2026-06-01 18:57 ` Jeff Layton
@ 2026-06-01 19:31 ` Al Viro
2026-06-01 19:38 ` Jeff Layton
2026-06-01 19:43 ` Chuck Lever
1 sibling, 1 reply; 15+ messages in thread
From: Al Viro @ 2026-06-01 19:31 UTC (permalink / raw)
To: Jeff Layton
Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Rick Macklem, Chris Mason, linux-nfs, linux-kernel
On Mon, Jun 01, 2026 at 02:57:16PM -0400, Jeff Layton wrote:
> Looks reasonable. Chuck has already taken this patch into his tree, but
> we could do a cleanup on top. Want to send us an "official" patch?
Not a problem, but it's completely untested. It compiles, but...
unify cleanups in nfsd_cross_mnt() exits
Instead of having a separate path_put() on each failure exit, as well as
on the normal path, let's move all of those past the point where these
codepaths join. We want to keep the ordering between path_put() and
exp_put(), so move that one as well.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 95ce15440492..cfac0cc4207c 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -137,20 +137,19 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
follow_flags = LOOKUP_AUTOMOUNT;
err = follow_down(&path, follow_flags);
- if (err < 0) {
- path_put(&path);
+ if (err < 0)
goto out;
- }
+
if (path.mnt == exp->ex_path.mnt && path.dentry == dentry &&
nfsd_mountpoint(dentry, exp) == 2) {
/* This is only a mountpoint in some other namespace */
- path_put(&path);
goto out;
}
exp2 = rqst_exp_get_by_name(rqstp, &path);
if (IS_ERR(exp2)) {
err = PTR_ERR(exp2);
+ exp2 = NULL;
/*
* We normally allow NFS clients to continue
* "underneath" a mountpoint that is not exported.
@@ -160,10 +159,7 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
*/
if (err == -ENOENT && !(exp->ex_flags & NFSEXP_V4ROOT))
err = 0;
- path_put(&path);
- goto out;
- }
- if (nfsd_v4client(rqstp) ||
+ } else if (nfsd_v4client(rqstp) ||
(exp->ex_flags & NFSEXP_CROSSMOUNT) || EX_NOHIDE(exp2)) {
/* successfully crossed mount point */
/*
@@ -177,9 +173,10 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
*expp = exp2;
exp2 = exp;
}
- path_put(&path);
- exp_put(exp2);
out:
+ path_put(&path);
+ if (exp2)
+ exp_put(exp2);
return err;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 2/6] nfsd: release path refs on follow_down() error
2026-06-01 19:31 ` Al Viro
@ 2026-06-01 19:38 ` Jeff Layton
0 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2026-06-01 19:38 UTC (permalink / raw)
To: Al Viro
Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Rick Macklem, Chris Mason, linux-nfs, linux-kernel
On Mon, 2026-06-01 at 20:31 +0100, Al Viro wrote:
> On Mon, Jun 01, 2026 at 02:57:16PM -0400, Jeff Layton wrote:
>
> > Looks reasonable. Chuck has already taken this patch into his tree, but
> > we could do a cleanup on top. Want to send us an "official" patch?
>
> Not a problem, but it's completely untested. It compiles, but...
>
Thanks Al. I'll toss it onto the pile that I'm testing.
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/6] nfsd: release path refs on follow_down() error
2026-06-01 18:57 ` Jeff Layton
2026-06-01 19:31 ` Al Viro
@ 2026-06-01 19:43 ` Chuck Lever
1 sibling, 0 replies; 15+ messages in thread
From: Chuck Lever @ 2026-06-01 19:43 UTC (permalink / raw)
To: Jeff Layton, Al Viro
Cc: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey, David Howells,
Rick Macklem, Chris Mason, linux-nfs, linux-kernel
On 6/1/26 11:57 AM, Jeff Layton wrote:
> On Mon, 2026-06-01 at 19:47 +0100, Al Viro wrote:
>> On Sun, May 31, 2026 at 08:06:59AM -0400, Jeff Layton wrote:
>>
>>
>>> Fix by calling path_put(&path) before the goto out in the err < 0
>>> arm so the entry-time refs are released on all follow_down() error
>>> returns.
>>
>> Might be better to unify all those path_put()... Something like this,
>> perhaps?
>>
>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>> index eafdf7b7890f..bdcd78f49112 100644
>> --- a/fs/nfsd/vfs.c
>> +++ b/fs/nfsd/vfs.c
>> @@ -141,13 +141,13 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
>> if (path.mnt == exp->ex_path.mnt && path.dentry == dentry &&
>> nfsd_mountpoint(dentry, exp) == 2) {
>> /* This is only a mountpoint in some other namespace */
>> - path_put(&path);
>> goto out;
>> }
>>
>> exp2 = rqst_exp_get_by_name(rqstp, &path);
>> if (IS_ERR(exp2)) {
>> err = PTR_ERR(exp2);
>> + exp2 = NULL;
>> /*
>> * We normally allow NFS clients to continue
>> * "underneath" a mountpoint that is not exported.
>> @@ -157,10 +157,7 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
>> */
>> if (err == -ENOENT && !(exp->ex_flags & NFSEXP_V4ROOT))
>> err = 0;
>> - path_put(&path);
>> - goto out;
>> - }
>> - if (nfsd_v4client(rqstp) ||
>> + } else if (nfsd_v4client(rqstp) ||
>> (exp->ex_flags & NFSEXP_CROSSMOUNT) || EX_NOHIDE(exp2)) {
>> /* successfully crossed mount point */
>> /*
>> @@ -174,9 +171,10 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
>> *expp = exp2;
>> exp2 = exp;
>> }
>> - path_put(&path);
>> - exp_put(exp2);
>> out:
>> + path_put(&path);
>> + if (exp2)
>> + exp_put(exp2);
>> return err;
>> }
>>
>
> Looks reasonable. Chuck has already taken this patch into his tree, but
> we could do a cleanup on top. Want to send us an "official" patch?
We can replace any of the patches already in nfsd-testing, it's a topic
branch.
--
Chuck Lever
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/6] nfsd: fix nfsd_file leak on inter-server COPY setup failure
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
2026-05-31 12:06 ` [PATCH 1/6] nfsd: size fh_verify server sockaddr slot by xpt_locallen Jeff Layton
2026-05-31 12:06 ` [PATCH 2/6] nfsd: release path refs on follow_down() error Jeff Layton
@ 2026-05-31 12:07 ` Jeff Layton
2026-05-31 12:07 ` [PATCH 4/6] nfsd: fix dentry ref leak on V4ROOT export filehandle lookup Jeff Layton
` (3 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 12:07 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Al Viro, Rick Macklem, Chris Mason
Cc: linux-nfs, linux-kernel, Jeff Layton
When nfsd4_setup_inter_ssc() fails, nfsd4_copy() returns
nfserr_offload_denied directly, bypassing the out: label where
release_copy_files() would drop the nf_dst reference taken by
nfs4_preprocess_stateid_op(). Each failed inter-server COPY
leaks one nfsd_file, pinning file/inode/dentry/vfsmount.
Fix by setting status and jumping to out: instead of returning
directly.
Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4proc.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9473aeb53f72..017474cd63b5 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2159,16 +2159,14 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
}
status = nfsd4_setup_inter_ssc(rqstp, cstate, copy);
if (status) {
- trace_nfsd_copy_done(copy, status);
- return nfserr_offload_denied;
+ status = nfserr_offload_denied;
+ goto out;
}
} else {
trace_nfsd_copy_intra(copy);
status = nfsd4_setup_intra_ssc(rqstp, cstate, copy);
- if (status) {
- trace_nfsd_copy_done(copy, status);
- return status;
- }
+ if (status)
+ goto out;
}
memcpy(©->fh, &cstate->current_fh.fh_handle,
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 4/6] nfsd: fix dentry ref leak on V4ROOT export filehandle lookup
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
` (2 preceding siblings ...)
2026-05-31 12:07 ` [PATCH 3/6] nfsd: fix nfsd_file leak on inter-server COPY setup failure Jeff Layton
@ 2026-05-31 12:07 ` Jeff Layton
2026-05-31 12:07 ` [PATCH 5/6] nfsd: release OPEN-decoded posix ACLs via op_release Jeff Layton
` (2 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 12:07 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Al Viro, Rick Macklem, Chris Mason
Cc: linux-nfs, linux-kernel, Jeff Layton
nfsd_set_fh_dentry() leaks the dentry reference from
exportfs_decode_fh_raw() when the NFS3_FHSIZE or NFS_FHSIZE
switch cases detect NFSEXP_V4ROOT and goto out. The out: label
calls exp_put() but never dput(dentry), and fhp->fh_dentry was
never assigned so fh_put() cannot compensate.
A crafted NFSv3 filehandle targeting a V4ROOT export's fsid
triggers the leak on every request.
Fixes: 6e171106dce7 ("nfsd: move V4ROOT version check to nfsd_set_fh_dentry()")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfsfh.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
index 429ca5c6ec08..b36915401758 100644
--- a/fs/nfsd/nfsfh.c
+++ b/fs/nfsd/nfsfh.c
@@ -344,15 +344,19 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct net *net,
if (dentry->d_sb->s_export_op->flags & EXPORT_OP_NOWCC)
fhp->fh_no_wcc = true;
fhp->fh_64bit_cookies = true;
- if (exp->ex_flags & NFSEXP_V4ROOT)
+ if (exp->ex_flags & NFSEXP_V4ROOT) {
+ dput(dentry);
goto out;
+ }
break;
case NFS_FHSIZE:
fhp->fh_no_wcc = true;
if (EX_WGATHER(exp))
fhp->fh_use_wgather = true;
- if (exp->ex_flags & NFSEXP_V4ROOT)
+ if (exp->ex_flags & NFSEXP_V4ROOT) {
+ dput(dentry);
goto out;
+ }
}
fhp->fh_dentry = dentry;
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 5/6] nfsd: release OPEN-decoded posix ACLs via op_release
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
` (3 preceding siblings ...)
2026-05-31 12:07 ` [PATCH 4/6] nfsd: fix dentry ref leak on V4ROOT export filehandle lookup Jeff Layton
@ 2026-05-31 12:07 ` Jeff Layton
2026-05-31 15:54 ` Chuck Lever
2026-05-31 12:07 ` [PATCH 6/6] nfsd: fix layout fence worker double-reference race Jeff Layton
2026-05-31 15:53 ` [PATCH 0/6] nfsd: medium-severity bugfixes Chuck Lever
6 siblings, 1 reply; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 12:07 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Al Viro, Rick Macklem, Chris Mason
Cc: linux-nfs, linux-kernel, Jeff Layton
From: Chris Mason <clm@meta.com>
nfsd4_decode_createhow4() calls nfsd4_decode_fattr4(), which allocates
refcounted struct posix_acl objects via posix_acl_alloc() and stores
them in open->op_pacl and open->op_dpacl. These pointers must be
released once the OPEN compound finishes.
When nfsd4_decode_open_claim4() returns a non-seqid-mutating error,
the dispatcher short-circuits before op_func runs:
nfsd4_proc_compound()
opdesc->op_func == nfsd4_open_omfg
if (!seqid_mutating_err(ntohl(op->status)))
return op->status; /* nfsd4_open() never runs */
opdesc->op_release(&op->u) /* must still release op_pacl/op_dpacl */
Before this change OP_OPEN had no .op_release in nfsd4_ops[], and the
release pair lived inside nfsd4_open() at its out_err: label. On the
short-circuit path nfsd4_open() is never invoked, so both posix_acl
refs leak on every malformed OPEN compound that carries valid POSIX
ACL createhow4 attributes.
Add nfsd4_open_release() and wire it as .op_release for OP_OPEN.
posix_acl_release() is NULL-safe, so the single release site covers
both the normal path and the nfsd4_open_omfg short-circuit. Remove
the matching posix_acl_release() pair from nfsd4_open()'s out_err:
label: the compound dispatcher calls op_release unconditionally after
every op, so leaving the in-function pair in place would double-release
op_pacl and op_dpacl on every ACL-bearing OPEN that reaches
nfsd4_open(), underflowing the refcount and freeing the posix_acl
while it is still reachable through op->u.
Fixes: 5fc51dfc2eb1 ("NFSD: Add support for XDR decoding POSIX draft ACLs")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Chris Mason <clm@meta.com>
---
fs/nfsd/nfs4proc.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 017474cd63b5..76de265bb9e1 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -681,8 +681,6 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
nfsd4_cleanup_open_state(cstate, open);
nfsd4_bump_seqid(cstate, status);
out_err:
- posix_acl_release(open->op_dpacl);
- posix_acl_release(open->op_pacl);
return status;
}
@@ -704,6 +702,13 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
return nfsd4_open(rqstp, cstate, &op->u);
}
+static void
+nfsd4_open_release(union nfsd4_op_u *u)
+{
+ posix_acl_release(u->open.op_dpacl);
+ posix_acl_release(u->open.op_pacl);
+}
+
/*
* filehandle-manipulating ops.
*/
@@ -3718,6 +3723,7 @@ static const struct nfsd4_operation nfsd4_ops[] = {
},
[OP_OPEN] = {
.op_func = nfsd4_open,
+ .op_release = nfsd4_open_release,
.op_flags = OP_HANDLES_WRONGSEC | OP_MODIFIES_SOMETHING,
.op_name = "OP_OPEN",
.op_rsize_bop = nfsd4_open_rsize,
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 5/6] nfsd: release OPEN-decoded posix ACLs via op_release
2026-05-31 12:07 ` [PATCH 5/6] nfsd: release OPEN-decoded posix ACLs via op_release Jeff Layton
@ 2026-05-31 15:54 ` Chuck Lever
2026-05-31 18:42 ` Jeff Layton
0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2026-05-31 15:54 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey, David Howells, Alexander Viro, Rick Macklem,
Chris Mason
Cc: linux-nfs, linux-kernel
On Sun, May 31, 2026, at 8:07 AM, Jeff Layton wrote:
> From: Chris Mason <clm@meta.com>
> Remove the matching posix_acl_release() pair from nfsd4_open()'s
> out_err: label: the compound dispatcher calls op_release
> unconditionally after every op, [...]
The double-free fix is right, but op_release is not called
unconditionally after every op, and that gap leaks the ACLs on the
v4.0 replay path.
op_release runs only at the release: label inside
nfsd4_encode_operation(). The compound loop skips that encoder on
a replay:
if (op->status == nfserr_replay_me) {
nfsd4_encode_replay(resp->xdr, op); /* no op_release */
...
} else {
nfsd4_encode_operation(resp, op); /* op_release here */
}
So every ACL-bearing v4.0 OPEN retransmit leaks two posix_acl refs.
Please release op->u on the replay branch too:
if (op->status == nfserr_replay_me) {
op->replay = &cstate->replay_owner->so_replay;
nfsd4_encode_replay(resp->xdr, op);
status = op->status = op->replay->rp_status;
if (op->opdesc->op_release)
op->opdesc->op_release(&op->u);
}
Let's fix the "unconditionally after every op" wording too.
I've applied the other 5 in this series, so you can just resend
this one.
--
Chuck Lever
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH 5/6] nfsd: release OPEN-decoded posix ACLs via op_release
2026-05-31 15:54 ` Chuck Lever
@ 2026-05-31 18:42 ` Jeff Layton
0 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 18:42 UTC (permalink / raw)
To: Chuck Lever, Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey, David Howells, Alexander Viro, Rick Macklem,
Chris Mason
Cc: linux-nfs, linux-kernel
On Sun, 2026-05-31 at 11:54 -0400, Chuck Lever wrote:
> On Sun, May 31, 2026, at 8:07 AM, Jeff Layton wrote:
> > From: Chris Mason <clm@meta.com>
>
> > Remove the matching posix_acl_release() pair from nfsd4_open()'s
> > out_err: label: the compound dispatcher calls op_release
> > unconditionally after every op, [...]
>
> The double-free fix is right, but op_release is not called
> unconditionally after every op, and that gap leaks the ACLs on the
> v4.0 replay path.
>
> op_release runs only at the release: label inside
> nfsd4_encode_operation(). The compound loop skips that encoder on
> a replay:
>
> if (op->status == nfserr_replay_me) {
> nfsd4_encode_replay(resp->xdr, op); /* no op_release */
> ...
> } else {
> nfsd4_encode_operation(resp, op); /* op_release here */
> }
>
> So every ACL-bearing v4.0 OPEN retransmit leaks two posix_acl refs.
>
> Please release op->u on the replay branch too:
>
> if (op->status == nfserr_replay_me) {
> op->replay = &cstate->replay_owner->so_replay;
> nfsd4_encode_replay(resp->xdr, op);
> status = op->status = op->replay->rp_status;
> if (op->opdesc->op_release)
> op->opdesc->op_release(&op->u);
> }
>
> Let's fix the "unconditionally after every op" wording too.
>
> I've applied the other 5 in this series, so you can just resend
> this one.
>
Good catch, will fix.
I initially had hoped that we could rely on the fact that a v4.0 client
couldn't issue an OPEN with a POSIX ACL attribute in it, but it turns
out that knfsd doesn't gate attributes by minorversion.
Maybe it should? That's a separate changeset though if so.
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 6/6] nfsd: fix layout fence worker double-reference race
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
` (4 preceding siblings ...)
2026-05-31 12:07 ` [PATCH 5/6] nfsd: release OPEN-decoded posix ACLs via op_release Jeff Layton
@ 2026-05-31 12:07 ` Jeff Layton
2026-05-31 15:53 ` [PATCH 0/6] nfsd: medium-severity bugfixes Chuck Lever
6 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2026-05-31 12:07 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David Howells, Al Viro, Rick Macklem, Chris Mason
Cc: linux-nfs, linux-kernel, Jeff Layton
The workqueue core clears WORK_STRUCT_PENDING before the callback
is invoked, so delayed_work_pending() in lm_breaker_timedout() can
return false while the fence worker is already running. This lets
the breaker take a duplicate sc_count reference and schedule a new
worker that coalesces with the in-progress one. The extra reference
is never put, leaking the layout stateid.
Replace the racy delayed_work_pending() check with an
ls_fence_inflight boolean set atomically with
refcount_inc_not_zero() under ls_lock, and cleared under ls_lock
before nfs4_put_stid() on every exit path. Remove the self-rearm
mod_delayed_work() at the top of the worker.
Fixes: f52792f484ba ("NFSD: Enforce timeout on layout recall and integrate lease manager fencing")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4layouts.c | 27 +++++++++++++++------------
fs/nfsd/state.h | 1 +
2 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 6c4e4fdd6c05..475246c0e20c 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -260,6 +260,7 @@ nfsd4_alloc_layout_stateid(struct nfsd4_compound_state *cstate,
}
ls->ls_fenced = false;
+ ls->ls_fence_inflight = false;
ls->ls_fence_delay = 0;
INIT_DELAYED_WORK(&ls->ls_fence_work, nfsd4_layout_fence_worker);
@@ -798,15 +799,6 @@ nfsd4_layout_fence_worker(struct work_struct *work)
struct nfs4_client *clp;
struct nfsd_net *nn;
- /*
- * The workqueue clears WORK_STRUCT_PENDING before invoking
- * this callback. Re-arm immediately so that
- * delayed_work_pending() returns true while the fence
- * operation is in progress, preventing
- * lm_breaker_timedout() from taking a duplicate reference.
- */
- mod_delayed_work(system_dfl_wq, &ls->ls_fence_work, 0);
-
spin_lock(&ls->ls_lock);
if (list_empty(&ls->ls_layouts)) {
spin_unlock(&ls->ls_lock);
@@ -816,6 +808,9 @@ nfsd4_layout_fence_worker(struct work_struct *work)
nfsd4_close_layout(ls);
ls->ls_fenced = true;
+ spin_lock(&ls->ls_lock);
+ ls->ls_fence_inflight = false;
+ spin_unlock(&ls->ls_lock);
nfs4_put_stid(&ls->ls_stid);
return;
}
@@ -901,18 +896,26 @@ nfsd4_layout_lm_breaker_timedout(struct file_lease *fl)
if ((!nfsd4_layout_ops[ls->ls_layout_type]->fence_client) ||
ls->ls_fenced)
return true;
- if (delayed_work_pending(&ls->ls_fence_work))
- return false;
/*
* Make sure layout has not been returned yet before
- * taking a reference count on the layout stateid.
+ * taking a reference count on the layout stateid. The
+ * ls_fence_inflight flag is set together with the sc_count
+ * increment under ls_lock so that a fence worker invocation
+ * already in progress (which has cleared WORK_STRUCT_PENDING
+ * but not yet reached dispose:) cannot be coalesced with a
+ * fresh schedule that takes an extra unmatched reference.
*/
spin_lock(&ls->ls_lock);
+ if (ls->ls_fence_inflight) {
+ spin_unlock(&ls->ls_lock);
+ return false;
+ }
if (list_empty(&ls->ls_layouts) ||
!refcount_inc_not_zero(&ls->ls_stid.sc_count)) {
spin_unlock(&ls->ls_lock);
return true;
}
+ ls->ls_fence_inflight = true;
spin_unlock(&ls->ls_lock);
mod_delayed_work(system_dfl_wq, &ls->ls_fence_work, 0);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index c26b2384d694..05b6f12040d8 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -812,6 +812,7 @@ struct nfs4_layout_stateid {
struct delayed_work ls_fence_work;
unsigned int ls_fence_delay;
bool ls_fenced;
+ bool ls_fence_inflight;
};
static inline struct nfs4_layout_stateid *layoutstateid(struct nfs4_stid *s)
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 0/6] nfsd: medium-severity bugfixes
2026-05-31 12:06 [PATCH 0/6] nfsd: medium-severity bugfixes Jeff Layton
` (5 preceding siblings ...)
2026-05-31 12:07 ` [PATCH 6/6] nfsd: fix layout fence worker double-reference race Jeff Layton
@ 2026-05-31 15:53 ` Chuck Lever
6 siblings, 0 replies; 15+ messages in thread
From: Chuck Lever @ 2026-05-31 15:53 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey, David Howells,
Al Viro, Rick Macklem, Chris Mason, Jeff Layton
Cc: Chuck Lever, linux-nfs, linux-kernel
From: Chuck Lever <chuck.lever@oracle.com>
On Sun, 31 May 2026 08:06:57 -0400, Jeff Layton wrote:
> These are classified as medium-severity, remote-triggerable. Some
> memory leaks and refcounting bugs, and a potential buffer overrun in a
> tracepoint.
Applied to nfsd-testing, thanks!
[1/6] nfsd: size fh_verify server sockaddr slot by xpt_locallen
commit: bbbc993f3932be9428ebf81009d0d1a4aed26d00
[2/6] nfsd: release path refs on follow_down() error
commit: 41c9073b511fe70d5e3b03f039c409d2060de06b
[3/6] nfsd: fix nfsd_file leak on inter-server COPY setup failure
commit: b55429cfce784ffcbaed2960d66b7c5db90ceec3
[4/6] nfsd: fix dentry ref leak on V4ROOT export filehandle lookup
commit: 657c7e3082589c35695f77821f1dfa31bf4c6a34
[6/6] nfsd: fix layout fence worker double-reference race
commit: 4faa69d7158a4436d340b7324aa34fa672362b5c
--
Chuck Lever <chuck.lever@oracle.com>
^ permalink raw reply [flat|nested] 15+ messages in thread