* [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg()
@ 2025-12-01 22:09 Chuck Lever
2025-12-02 12:08 ` Jeff Layton
0 siblings, 1 reply; 5+ messages in thread
From: Chuck Lever @ 2025-12-01 22:09 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites
fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if
the file already has a READ open from a previous OPEN operation,
this overwrites the existing pointer without releasing its reference,
orphaning the previous reference.
Additionally, the function originally stored the same nfsd_file
pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with
only a single reference. When put_deleg_file() runs, it clears
fi_rdeleg_file and calls nfs4_file_put_access() to release the file.
However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when
the fi_access[O_RDONLY] counter drops to zero. If another READ open
exists on the file, the counter remains elevated and the nfsd_file
reference from the delegation is never released. This potentially
causes open conflicts on that file.
But, on server shutdown, these leaks cause __nfsd_file_cache_purge()
to encounter files with an elevated reference count that cannot be
cleaned up, ultimately triggering a BUG() in kmem_cache_destroy()
because there are still nfsd_file objects allocated in that cache.
Fixes: e7a8ebc305f2 ("NFSD: Offer write deleg for OPEN4_SHARE_ACCESS_WRITE")
X-Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4state.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 35004568d43e..11877b96dc4c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1218,8 +1218,10 @@ static void put_deleg_file(struct nfs4_file *fp)
if (nf)
nfsd_file_put(nf);
- if (rnf)
+ if (rnf) {
+ nfsd_file_put(rnf);
nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ);
+ }
}
static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f)
@@ -6231,10 +6233,14 @@ nfsd4_add_rdaccess_to_wrdeleg(struct svc_rqst *rqstp, struct nfsd4_open *open,
fp = stp->st_stid.sc_file;
spin_lock(&fp->fi_lock);
__nfs4_file_get_access(fp, NFS4_SHARE_ACCESS_READ);
- fp = stp->st_stid.sc_file;
- fp->fi_fds[O_RDONLY] = nf;
- fp->fi_rdeleg_file = nf;
+ if (!fp->fi_fds[O_RDONLY]) {
+ fp->fi_fds[O_RDONLY] = nf;
+ nf = NULL;
+ }
+ fp->fi_rdeleg_file = nfsd_file_get(fp->fi_fds[O_RDONLY]);
spin_unlock(&fp->fi_lock);
+ if (nf)
+ nfsd_file_put(nf);
}
return true;
}
--
2.52.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() 2025-12-01 22:09 [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() Chuck Lever @ 2025-12-02 12:08 ` Jeff Layton 2025-12-02 13:34 ` Chuck Lever 0 siblings, 1 reply; 5+ messages in thread From: Jeff Layton @ 2025-12-02 12:08 UTC (permalink / raw) To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey Cc: linux-nfs, Chuck Lever On Mon, 2025-12-01 at 17:09 -0500, Chuck Lever wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites > fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if > the file already has a READ open from a previous OPEN operation, > this overwrites the existing pointer without releasing its reference, > orphaning the previous reference. > > Additionally, the function originally stored the same nfsd_file > pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with > only a single reference. When put_deleg_file() runs, it clears > fi_rdeleg_file and calls nfs4_file_put_access() to release the file. > > However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when > the fi_access[O_RDONLY] counter drops to zero. If another READ open > exists on the file, the counter remains elevated and the nfsd_file > reference from the delegation is never released. This potentially > causes open conflicts on that file. > > But, on server shutdown, these leaks cause __nfsd_file_cache_purge() > to encounter files with an elevated reference count that cannot be > cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() > because there are still nfsd_file objects allocated in that cache. > > Fixes: e7a8ebc305f2 ("NFSD: Offer write deleg for OPEN4_SHARE_ACCESS_WRITE") > X-Cc: stable@vger.kernel.org > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > fs/nfsd/nfs4state.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index 35004568d43e..11877b96dc4c 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -1218,8 +1218,10 @@ static void put_deleg_file(struct nfs4_file *fp) > > if (nf) > nfsd_file_put(nf); > - if (rnf) > + if (rnf) { > + nfsd_file_put(rnf); > nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ); > + } > } > > static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) > @@ -6231,10 +6233,14 @@ nfsd4_add_rdaccess_to_wrdeleg(struct svc_rqst *rqstp, struct nfsd4_open *open, > fp = stp->st_stid.sc_file; > spin_lock(&fp->fi_lock); > __nfs4_file_get_access(fp, NFS4_SHARE_ACCESS_READ); > - fp = stp->st_stid.sc_file; Weird. Just noticed the double assignment here. > - fp->fi_fds[O_RDONLY] = nf; > - fp->fi_rdeleg_file = nf; > + if (!fp->fi_fds[O_RDONLY]) { > + fp->fi_fds[O_RDONLY] = nf; > + nf = NULL; > + } > + fp->fi_rdeleg_file = nfsd_file_get(fp->fi_fds[O_RDONLY]); > spin_unlock(&fp->fi_lock); > + if (nf) > + nfsd_file_put(nf); > } > return true; > } I do so wish this refcounting were easier to get right, but I don't have any great ideas around it yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() 2025-12-02 12:08 ` Jeff Layton @ 2025-12-02 13:34 ` Chuck Lever 2025-12-02 14:53 ` Jeff Layton 0 siblings, 1 reply; 5+ messages in thread From: Chuck Lever @ 2025-12-02 13:34 UTC (permalink / raw) To: Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey Cc: linux-nfs, Chuck Lever On 12/2/25 7:08 AM, Jeff Layton wrote: > On Mon, 2025-12-01 at 17:09 -0500, Chuck Lever wrote: >> From: Chuck Lever <chuck.lever@oracle.com> >> >> nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites >> fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if >> the file already has a READ open from a previous OPEN operation, >> this overwrites the existing pointer without releasing its reference, >> orphaning the previous reference. >> >> Additionally, the function originally stored the same nfsd_file >> pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with >> only a single reference. When put_deleg_file() runs, it clears >> fi_rdeleg_file and calls nfs4_file_put_access() to release the file. >> >> However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when >> the fi_access[O_RDONLY] counter drops to zero. If another READ open >> exists on the file, the counter remains elevated and the nfsd_file >> reference from the delegation is never released. This potentially >> causes open conflicts on that file. >> >> But, on server shutdown, these leaks cause __nfsd_file_cache_purge() >> to encounter files with an elevated reference count that cannot be >> cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() >> because there are still nfsd_file objects allocated in that cache. >> >> Fixes: e7a8ebc305f2 ("NFSD: Offer write deleg for OPEN4_SHARE_ACCESS_WRITE") >> X-Cc: stable@vger.kernel.org >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >> --- >> fs/nfsd/nfs4state.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >> index 35004568d43e..11877b96dc4c 100644 >> --- a/fs/nfsd/nfs4state.c >> +++ b/fs/nfsd/nfs4state.c >> @@ -1218,8 +1218,10 @@ static void put_deleg_file(struct nfs4_file *fp) >> >> if (nf) >> nfsd_file_put(nf); >> - if (rnf) >> + if (rnf) { >> + nfsd_file_put(rnf); >> nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ); >> + } >> } >> >> static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) >> @@ -6231,10 +6233,14 @@ nfsd4_add_rdaccess_to_wrdeleg(struct svc_rqst *rqstp, struct nfsd4_open *open, >> fp = stp->st_stid.sc_file; >> spin_lock(&fp->fi_lock); >> __nfs4_file_get_access(fp, NFS4_SHARE_ACCESS_READ); >> - fp = stp->st_stid.sc_file; > > Weird. Just noticed the double assignment here. > >> - fp->fi_fds[O_RDONLY] = nf; >> - fp->fi_rdeleg_file = nf; >> + if (!fp->fi_fds[O_RDONLY]) { >> + fp->fi_fds[O_RDONLY] = nf; >> + nf = NULL; >> + } >> + fp->fi_rdeleg_file = nfsd_file_get(fp->fi_fds[O_RDONLY]); >> spin_unlock(&fp->fi_lock); >> + if (nf) >> + nfsd_file_put(nf); >> } >> return true; >> } > > I do so wish this refcounting were easier to get right, but I don't > have any great ideas around it yet. > > Reviewed-by: Jeff Layton <jlayton@kernel.org> Thanks for the R-b. Chris's review-prompts do generic navigation of reference counting, so we have a little more of a back-stop now. I ran the review-prompts against e7a8ebc305f2 on a lark, and they indeed found this problem. -- Chuck Lever ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() 2025-12-02 13:34 ` Chuck Lever @ 2025-12-02 14:53 ` Jeff Layton 2025-12-02 19:13 ` Chuck Lever 0 siblings, 1 reply; 5+ messages in thread From: Jeff Layton @ 2025-12-02 14:53 UTC (permalink / raw) To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey Cc: linux-nfs, Chuck Lever On Tue, 2025-12-02 at 08:34 -0500, Chuck Lever wrote: > On 12/2/25 7:08 AM, Jeff Layton wrote: > > On Mon, 2025-12-01 at 17:09 -0500, Chuck Lever wrote: > > > From: Chuck Lever <chuck.lever@oracle.com> > > > > > > nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites > > > fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if > > > the file already has a READ open from a previous OPEN operation, > > > this overwrites the existing pointer without releasing its reference, > > > orphaning the previous reference. > > > > > > Additionally, the function originally stored the same nfsd_file > > > pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with > > > only a single reference. When put_deleg_file() runs, it clears > > > fi_rdeleg_file and calls nfs4_file_put_access() to release the file. > > > > > > However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when > > > the fi_access[O_RDONLY] counter drops to zero. If another READ open > > > exists on the file, the counter remains elevated and the nfsd_file > > > reference from the delegation is never released. This potentially > > > causes open conflicts on that file. > > > > > > But, on server shutdown, these leaks cause __nfsd_file_cache_purge() > > > to encounter files with an elevated reference count that cannot be > > > cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() > > > because there are still nfsd_file objects allocated in that cache. > > > > > > Fixes: e7a8ebc305f2 ("NFSD: Offer write deleg for OPEN4_SHARE_ACCESS_WRITE") > > > X-Cc: stable@vger.kernel.org > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > > > --- > > > fs/nfsd/nfs4state.c | 14 ++++++++++---- > > > 1 file changed, 10 insertions(+), 4 deletions(-) > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > index 35004568d43e..11877b96dc4c 100644 > > > --- a/fs/nfsd/nfs4state.c > > > +++ b/fs/nfsd/nfs4state.c > > > @@ -1218,8 +1218,10 @@ static void put_deleg_file(struct nfs4_file *fp) > > > > > > if (nf) > > > nfsd_file_put(nf); > > > - if (rnf) > > > + if (rnf) { > > > + nfsd_file_put(rnf); > > > nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ); > > > + } > > > } > > > > > > static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) > > > @@ -6231,10 +6233,14 @@ nfsd4_add_rdaccess_to_wrdeleg(struct svc_rqst *rqstp, struct nfsd4_open *open, > > > fp = stp->st_stid.sc_file; > > > spin_lock(&fp->fi_lock); > > > __nfs4_file_get_access(fp, NFS4_SHARE_ACCESS_READ); > > > - fp = stp->st_stid.sc_file; > > > > Weird. Just noticed the double assignment here. > > > > > - fp->fi_fds[O_RDONLY] = nf; > > > - fp->fi_rdeleg_file = nf; > > > + if (!fp->fi_fds[O_RDONLY]) { > > > + fp->fi_fds[O_RDONLY] = nf; > > > + nf = NULL; > > > + } > > > + fp->fi_rdeleg_file = nfsd_file_get(fp->fi_fds[O_RDONLY]); > > > spin_unlock(&fp->fi_lock); > > > + if (nf) > > > + nfsd_file_put(nf); > > > } > > > return true; > > > } > > > > I do so wish this refcounting were easier to get right, but I don't > > have any great ideas around it yet. > > > > Reviewed-by: Jeff Layton <jlayton@kernel.org> > > Thanks for the R-b. Chris's review-prompts do generic navigation of > reference counting, so we have a little more of a back-stop now. I ran > the review-prompts against e7a8ebc305f2 on a lark, and they indeed found > this problem. > Now that I look again, I'm wondering -- is this problem possible? nfsd4_add_rdaccess_to_wrdeleg() is called after we have set a write lease on the file. There should be no other open files at that point, so fi_fds[O_RDONLY] must be NULL already. Instead of or in addition to doing this, maybe we should be doing a WARN_ON_ONCE() if fp->fi_fds[O_RDONLY] is non-NULL here? -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() 2025-12-02 14:53 ` Jeff Layton @ 2025-12-02 19:13 ` Chuck Lever 0 siblings, 0 replies; 5+ messages in thread From: Chuck Lever @ 2025-12-02 19:13 UTC (permalink / raw) To: Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey Cc: linux-nfs, Chuck Lever On Tue, Dec 2, 2025, at 9:53 AM, Jeff Layton wrote: > On Tue, 2025-12-02 at 08:34 -0500, Chuck Lever wrote: >> On 12/2/25 7:08 AM, Jeff Layton wrote: >> > On Mon, 2025-12-01 at 17:09 -0500, Chuck Lever wrote: >> > > From: Chuck Lever <chuck.lever@oracle.com> >> > > >> > > nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites >> > > fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if >> > > the file already has a READ open from a previous OPEN operation, >> > > this overwrites the existing pointer without releasing its reference, >> > > orphaning the previous reference. >> > > >> > > Additionally, the function originally stored the same nfsd_file >> > > pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with >> > > only a single reference. When put_deleg_file() runs, it clears >> > > fi_rdeleg_file and calls nfs4_file_put_access() to release the file. >> > > >> > > However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when >> > > the fi_access[O_RDONLY] counter drops to zero. If another READ open >> > > exists on the file, the counter remains elevated and the nfsd_file >> > > reference from the delegation is never released. This potentially >> > > causes open conflicts on that file. >> > > >> > > But, on server shutdown, these leaks cause __nfsd_file_cache_purge() >> > > to encounter files with an elevated reference count that cannot be >> > > cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() >> > > because there are still nfsd_file objects allocated in that cache. >> > > >> > > Fixes: e7a8ebc305f2 ("NFSD: Offer write deleg for OPEN4_SHARE_ACCESS_WRITE") >> > > X-Cc: stable@vger.kernel.org >> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >> > > --- >> > > fs/nfsd/nfs4state.c | 14 ++++++++++---- >> > > 1 file changed, 10 insertions(+), 4 deletions(-) >> > > >> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >> > > index 35004568d43e..11877b96dc4c 100644 >> > > --- a/fs/nfsd/nfs4state.c >> > > +++ b/fs/nfsd/nfs4state.c >> > > @@ -1218,8 +1218,10 @@ static void put_deleg_file(struct nfs4_file *fp) >> > > >> > > if (nf) >> > > nfsd_file_put(nf); >> > > - if (rnf) >> > > + if (rnf) { >> > > + nfsd_file_put(rnf); >> > > nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ); >> > > + } >> > > } >> > > >> > > static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) >> > > @@ -6231,10 +6233,14 @@ nfsd4_add_rdaccess_to_wrdeleg(struct svc_rqst *rqstp, struct nfsd4_open *open, >> > > fp = stp->st_stid.sc_file; >> > > spin_lock(&fp->fi_lock); >> > > __nfs4_file_get_access(fp, NFS4_SHARE_ACCESS_READ); >> > > - fp = stp->st_stid.sc_file; >> > >> > Weird. Just noticed the double assignment here. >> > >> > > - fp->fi_fds[O_RDONLY] = nf; >> > > - fp->fi_rdeleg_file = nf; >> > > + if (!fp->fi_fds[O_RDONLY]) { >> > > + fp->fi_fds[O_RDONLY] = nf; >> > > + nf = NULL; >> > > + } >> > > + fp->fi_rdeleg_file = nfsd_file_get(fp->fi_fds[O_RDONLY]); >> > > spin_unlock(&fp->fi_lock); >> > > + if (nf) >> > > + nfsd_file_put(nf); >> > > } >> > > return true; >> > > } >> > >> > I do so wish this refcounting were easier to get right, but I don't >> > have any great ideas around it yet. >> > >> > Reviewed-by: Jeff Layton <jlayton@kernel.org> >> >> Thanks for the R-b. Chris's review-prompts do generic navigation of >> reference counting, so we have a little more of a back-stop now. I ran >> the review-prompts against e7a8ebc305f2 on a lark, and they indeed found >> this problem. >> > > Now that I look again, I'm wondering -- is this problem possible? It is. The server does crash during shutdown as described above, and this patch makes the crashes stop. > nfsd4_add_rdaccess_to_wrdeleg() is called after we have set a write > lease on the file. There should be no other open files at that point, > so fi_fds[O_RDONLY] must be NULL already. > > Instead of or in addition to doing this, maybe we should be doing a > WARN_ON_ONCE() if fp->fi_fds[O_RDONLY] is non-NULL here? A WARN_ON(1) will help me nail down exactly what part of xfstests is triggering the problem. Stay tuned. -- Chuck Lever ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-12-02 19:13 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-12-01 22:09 [PATCH v1] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() Chuck Lever 2025-12-02 12:08 ` Jeff Layton 2025-12-02 13:34 ` Chuck Lever 2025-12-02 14:53 ` Jeff Layton 2025-12-02 19:13 ` Chuck Lever
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox