* [PATCH] cachefiles: Fix excess dput() after end_removing()
@ 2026-03-24 22:35 David Howells
2026-03-24 22:50 ` David Howells
2026-03-26 10:15 ` [PATCH v2] " David Howells
0 siblings, 2 replies; 6+ messages in thread
From: David Howells @ 2026-03-24 22:35 UTC
To: NeilBrown
Cc: dhowells, Marc Dionne, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel
When cachefiles_cull() calls cachefiles_bury_object(), the latter eats the
former's ref on the victim dentry that it obtained from
cachefiles_lookup_for_cull(). However, commit 7bb1eb45e43c left the dput
of the victim in place, resulting in occasional:
WARNING: fs/dcache.c:829 at dput.part.0+0xf5/0x110, CPU#7: cachefilesd/11831
cachefiles_cull+0x8c/0xe0 [cachefiles]
cachefiles_daemon_cull+0xcd/0x120 [cachefiles]
cachefiles_daemon_write+0x14e/0x1d0 [cachefiles]
vfs_write+0xc3/0x480
...
reports.
Fix this by removing the dput().
Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: NeilBrown <neil@brown.name>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index bdac2f33edf3..e2023e78e4df 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -795,7 +795,6 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
ret = cachefiles_bury_object(cache, NULL, dir, victim,
FSCACHE_OBJECT_WAS_CULLED);
- dput(victim);
if (ret < 0)
goto error;
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-24 22:35 [PATCH] cachefiles: Fix excess dput() after end_removing() David Howells
@ 2026-03-24 22:50 ` David Howells
2026-03-25 12:57 ` Marc Dionne
2026-03-26 10:15 ` [PATCH v2] " David Howells
1 sibling, 1 reply; 6+ messages in thread
From: David Howells @ 2026-03-24 22:50 UTC
Cc: dhowells, NeilBrown, Marc Dionne, Paulo Alcantara,
Christian Brauner, netfs, linux-afs, linux-fsdevel, linux-kernel
David Howells <dhowells@redhat.com> wrote:
> Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
Actually, this should probably be:
Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-24 22:50 ` David Howells
@ 2026-03-25 12:57 ` Marc Dionne
2026-03-26 7:51 ` David Howells
0 siblings, 1 reply; 6+ messages in thread
From: Marc Dionne @ 2026-03-25 12:57 UTC
To: David Howells
Cc: NeilBrown, Paulo Alcantara, Christian Brauner, netfs, linux-afs,
linux-fsdevel, linux-kernel
On Tue, Mar 24, 2026 at 7:50 PM David Howells <dhowells@redhat.com> wrote:
>
> David Howells <dhowells@redhat.com> wrote:
>
> > Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
>
> Actually, this should probably be:
>
> Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
I think it is the correct Fixes tag, but I'm not sure that this is
actually the right fix. 7bb1eb45e43c switched other callers of
cachefiles_bury_object to use start_removing_dentry, which gets an
additional ref, and removed the extra dget from
cachefiles_bury_object. In the cachefiles_cull case however, the
dentry is from start_removing and has a single ref on entry to
cachefiles_bury_object, which is an issue as "rep" may be used there
after end_removing may have put the last ref. So the correct fix is
probably for cachefiles_cull to add a dget() before the call to
cachefiles_bury_object.
Marc
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-25 12:57 ` Marc Dionne
@ 2026-03-26 7:51 ` David Howells
2026-03-26 9:07 ` NeilBrown
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2026-03-26 7:51 UTC
To: Marc Dionne
Cc: dhowells, NeilBrown, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel
Marc Dionne <marc.c.dionne@gmail.com> wrote:
> I think it is the correct Fixes tag, but I'm not sure that this is
> actually the right fix. 7bb1eb45e43c switched other callers of
> cachefiles_bury_object to use start_removing_dentry, which gets an
> additional ref, and removed the extra dget from
> cachefiles_bury_object. In the cachefiles_cull case however, the
> dentry is from start_removing and has a single ref on entry to
> cachefiles_bury_object, which is an issue as "rep" may be used there
> after end_removing may have put the last ref. So the correct fix is
> probably for cachefiles_cull to add a dget() before the call to
> cachefiles_bury_object.
Ugh. You're right.
The problem is that we're calling start_removing() without knowing whether we
can just unlink the object. I wonder if I need to do the lookup in
cachefiles_lookup_for_cull() and only then call start_removing_dentry() if
it's not a directory (directories get moved to the graveyard for cachefilesd
to tear down).
I think the right solution is actually to move start_removing_dentry() down
into cachefiles_bury_object() and make it contingent on the dentry being a
non-dir.
David
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-26 7:51 ` David Howells
@ 2026-03-26 9:07 ` NeilBrown
0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2026-03-26 9:07 UTC
To: David Howells
Cc: Marc Dionne, dhowells, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel
On Thu, 26 Mar 2026, David Howells wrote:
> Marc Dionne <marc.c.dionne@gmail.com> wrote:
>
> > I think it is the correct Fixes tag, but I'm not sure that this is
> > actually the right fix. 7bb1eb45e43c switched other callers of
> > cachefiles_bury_object to use start_removing_dentry, which gets an
> > additional ref, and removed the extra dget from
> > cachefiles_bury_object. In the cachefiles_cull case however, the
> > dentry is from start_removing and has a single ref on entry to
> > cachefiles_bury_object, which is an issue as "rep" may be used there
> > after end_removing may have put the last ref. So the correct fix is
> > probably for cachefiles_cull to add a dget() before the call to
> > cachefiles_bury_object.
>
> Ugh. You're right.
>
> The problem is that we're calling start_removing() without knowing whether we
> can just unlink the object. I wonder if I need to do the lookup in
> cachefiles_lookup_for_cull() and only then call start_removing_dentry() if
> it's not a directory (directories get moved to the graveyard for cachefilesd
> to tear down).
>
> I think the right solution is actually to move start_removing_dentry() down
> into cachefiles_bury_object() and make it contingent on the dentry being a
> non-dir.
>
> David
>
>
cachefiles_bury_object() has a comment saying:
* On entry there must be at least 2 refs on rep, one will be dropped on exit.
and this is consistent with the code in that function.
It is called from 3 places.
- cachefiles_invalidate_cookie(), cachefiles_look_up_object(), and
cachefiles_acquire_volume() all precede it with a
start_removing_dentry() which results in 2 references to the dentry
(the original and an extra which it takes) - so that fits with the
comment.
- cachefiles_cull() precedes it with cachefiles_lookup_for_cull()
which uses start_removing() which returns with 1 reference to the
dentry. As the dentry didn't pre-exist, there is only one ref.
So this is incorrect.
cachefiles_cull() needs to take an extra reference to victim so that
when cachefiles_bury_object() calls end_removing(), it still has a valid
reference.
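The counting above can be checked with a small user-space model. This is
only an illustrative sketch, not kernel code: struct toy_dentry and all
toy_* helpers are hypothetical stand-ins, and end_removing() is assumed to
do nothing except drop one reference, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dentry: just a refcount and a freed flag. */
struct toy_dentry {
	int refs;
	bool freed;
};

/* Models dget(): take one more reference and return the dentry. */
static struct toy_dentry *toy_dget(struct toy_dentry *d)
{
	d->refs++;
	return d;
}

/* Models dput(): drop one reference; dropping the last one "frees" it. */
static void toy_dput(struct toy_dentry *d)
{
	if (--d->refs == 0)
		d->freed = true;
}

/* Assumption: end_removing() is modelled only as dropping one reference. */
static void toy_end_removing(struct toy_dentry *d)
{
	toy_dput(d);
}

/*
 * Models the directory path of cachefiles_bury_object(): end_removing()
 * runs first and rep is used afterwards.  Returns true if rep still had
 * a reference at the point of that later use.
 */
static bool toy_bury_dir(struct toy_dentry *rep)
{
	toy_end_removing(rep);
	return !rep->freed;
}

/* cachefiles_cull() passing a victim that holds only a single ref. */
bool toy_cull_single_ref(void)
{
	struct toy_dentry victim = { .refs = 1, .freed = false };

	return toy_bury_dir(&victim);
}

/* cachefiles_cull() taking an extra ref first, then dropping its own. */
bool toy_cull_extra_ref(void)
{
	struct toy_dentry victim = { .refs = 1, .freed = false };
	bool alive = toy_bury_dir(toy_dget(&victim));

	toy_dput(&victim);	/* the caller's own reference */
	return alive && victim.freed;
}
```

In this model toy_cull_single_ref() reports that rep was already released
when it was used, while toy_cull_extra_ref() keeps it alive and still ends
with the count back at zero.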
So I think
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -781,7 +781,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
if (ret < 0)
goto error_unlock;
- ret = cachefiles_bury_object(cache, NULL, dir, victim,
+ ret = cachefiles_bury_object(cache, NULL, dir, dget(victim),
FSCACHE_OBJECT_WAS_CULLED);
dput(victim);
if (ret < 0)
would be a correct fix.
If you agree I can post a properly formatted patch with an explanation.
Thanks,
NeilBrown
* [PATCH v2] cachefiles: Fix excess dput() after end_removing()
2026-03-24 22:35 [PATCH] cachefiles: Fix excess dput() after end_removing() David Howells
2026-03-24 22:50 ` David Howells
@ 2026-03-26 10:15 ` David Howells
1 sibling, 0 replies; 6+ messages in thread
From: David Howells @ 2026-03-26 10:15 UTC
To: NeilBrown
Cc: dhowells, Marc Dionne, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel
When cachefiles_cull() calls cachefiles_bury_object(), the latter eats the
former's ref on the victim dentry that it obtained from
cachefiles_lookup_for_cull(). However, commit 7bb1eb45e43c left the dput
of the victim in place, resulting in occasional:
WARNING: fs/dcache.c:829 at dput.part.0+0xf5/0x110, CPU#7: cachefilesd/11831
cachefiles_cull+0x8c/0xe0 [cachefiles]
cachefiles_daemon_cull+0xcd/0x120 [cachefiles]
cachefiles_daemon_write+0x14e/0x1d0 [cachefiles]
vfs_write+0xc3/0x480
...
reports.
Actually, it's worse than that: cachefiles_bury_object() eats the ref it
was given - and then may continue to access the now-unref'd dentry if it
turns out to be a directory. So simply removing the aberrant dput() is not
sufficient.
Fix this by making cachefiles_bury_object() retain the ref itself around
end_removing() if it needs to keep it and then drop the ref before returning.
Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: NeilBrown <neil@brown.name>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
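As a rough illustration of the approach described in the commit message
(hold a ref across end_removing() and drop it at a single exit), here is a
user-space sketch. It is not the patch itself: struct toy_dentry,
struct toy_result and the toy_* functions are hypothetical, end_removing()
is assumed to only drop one reference, and the "fail" flag stands in for
any of the error exits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dentry. */
struct toy_dentry {
	int refs;
	bool freed;
	bool is_dir;
};

/* What a simulated cull run observed. */
struct toy_result {
	int ret;	/* return value of the bury step */
	int refs;	/* references left on the victim afterwards */
	bool alive;	/* was rep still referenced when it was used? */
};

static void toy_dget(struct toy_dentry *d)
{
	d->refs++;
}

static void toy_dput(struct toy_dentry *d)
{
	if (--d->refs == 0)
		d->freed = true;
}

/* Assumption: end_removing() is modelled only as dropping one reference. */
static void toy_end_removing(struct toy_dentry *d)
{
	toy_dput(d);
}

/* Models the reworked bury step: it consumes the caller's single ref. */
static int toy_bury_object(struct toy_dentry *rep, bool fail, bool *alive)
{
	int ret;

	if (!rep->is_dir) {
		/* Non-directories are unlinked; rep is not used afterwards. */
		*alive = !rep->freed;
		toy_end_removing(rep);
		return 0;
	}

	toy_dget(rep);		/* keep rep valid for the graveyard move */
	toy_end_removing(rep);	/* drops the ref passed in by the caller */
	*alive = !rep->freed;

	if (fail) {
		ret = -EIO;	/* stands in for any error exit */
		goto out;
	}
	ret = 0;
out:
	toy_dput(rep);		/* single place where the held ref is dropped */
	return ret;
}

/* Models cachefiles_cull() after the patch: no dput() of its own. */
struct toy_result toy_cull(bool is_dir, bool fail)
{
	struct toy_dentry victim = { .refs = 1, .freed = false, .is_dir = is_dir };
	struct toy_result res = { 0, 0, false };

	res.ret = toy_bury_object(&victim, fail, &res.alive);
	res.refs = victim.refs;
	return res;
}
```

In this model every path, success or error, leaves the victim with zero
references and never touches it after the last one is dropped, which is
why the caller no longer needs a dput() of its own.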
fs/cachefiles/namei.c | 36 +++++++++++++++++++++---------------
1 file changed, 21 insertions(+), 15 deletions(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index e5ec90dccc27..20138309733f 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -287,14 +287,14 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
if (!d_is_dir(rep)) {
ret = cachefiles_unlink(cache, object, dir, rep, why);
end_removing(rep);
-
_leave(" = %d", ret);
return ret;
}
/* directories have to be moved to the graveyard */
_debug("move stale object to graveyard");
- end_removing(rep);
+ dget(rep);
+ end_removing(rep); /* Drops ref on rep */
try_again:
/* first step is to make up a grave dentry in the graveyard */
@@ -304,8 +304,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
/* do the multiway lock magic */
trap = lock_rename(cache->graveyard, dir);
- if (IS_ERR(trap))
- return PTR_ERR(trap);
+ if (IS_ERR(trap)) {
+ ret = PTR_ERR(trap);
+ goto out;
+ }
/* do some checks before getting the grave dentry */
if (rep->d_parent != dir || IS_DEADDIR(d_inode(rep))) {
@@ -313,25 +315,27 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
* lock */
unlock_rename(cache->graveyard, dir);
_leave(" = 0 [culled?]");
- return 0;
+ ret = 0;
+ goto out;
}
+ ret = -EIO;
if (!d_can_lookup(cache->graveyard)) {
unlock_rename(cache->graveyard, dir);
cachefiles_io_error(cache, "Graveyard no longer a directory");
- return -EIO;
+ goto out;
}
if (trap == rep) {
unlock_rename(cache->graveyard, dir);
cachefiles_io_error(cache, "May not make directory loop");
- return -EIO;
+ goto out;
}
if (d_mountpoint(rep)) {
unlock_rename(cache->graveyard, dir);
cachefiles_io_error(cache, "Mountpoint in cache");
- return -EIO;
+ goto out;
}
grave = lookup_one(&nop_mnt_idmap, &QSTR(nbuffer), cache->graveyard);
@@ -343,11 +347,12 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
if (PTR_ERR(grave) == -ENOMEM) {
_leave(" = -ENOMEM");
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto out;
}
cachefiles_io_error(cache, "Lookup error %ld", PTR_ERR(grave));
- return -EIO;
+ goto out;
}
if (d_is_positive(grave)) {
@@ -362,7 +367,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
unlock_rename(cache->graveyard, dir);
dput(grave);
cachefiles_io_error(cache, "Mountpoint in graveyard");
- return -EIO;
+ goto out;
}
/* target should not be an ancestor of source */
@@ -370,7 +375,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
unlock_rename(cache->graveyard, dir);
dput(grave);
cachefiles_io_error(cache, "May not make directory loop");
- return -EIO;
+ goto out;
}
/* attempt the rename */
@@ -404,8 +409,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
__cachefiles_unmark_inode_in_use(object, d_inode(rep));
unlock_rename(cache->graveyard, dir);
dput(grave);
- _leave(" = 0");
- return 0;
+ _leave(" = %d", ret);
+out:
+ dput(rep);
+ return ret;
}
/*
@@ -812,7 +819,6 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
ret = cachefiles_bury_object(cache, NULL, dir, victim,
FSCACHE_OBJECT_WAS_CULLED);
- dput(victim);
if (ret < 0)
goto error;