* [PATCH] cachefiles: Fix excess dput() after end_removing()
@ 2026-03-24 22:35 David Howells
2026-03-24 22:50 ` David Howells
2026-03-26 10:15 ` [PATCH v2] " David Howells
0 siblings, 2 replies; 6+ messages in thread
From: David Howells @ 2026-03-24 22:35 UTC
To: NeilBrown
Cc: dhowells, Marc Dionne, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel
When cachefiles_cull() calls cachefiles_bury_object(), the latter eats the
former's ref on the victim dentry that it obtained from
cachefiles_lookup_for_cull(). However, commit 7bb1eb45e43c left the dput
of the victim in place, resulting in occasional:
	WARNING: fs/dcache.c:829 at dput.part.0+0xf5/0x110, CPU#7: cachefilesd/11831
	cachefiles_cull+0x8c/0xe0 [cachefiles]
	cachefiles_daemon_cull+0xcd/0x120 [cachefiles]
	cachefiles_daemon_write+0x14e/0x1d0 [cachefiles]
	vfs_write+0xc3/0x480
	...
reports.
Fix this by removing the dput().
Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: NeilBrown <neil@brown.name>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index bdac2f33edf3..e2023e78e4df 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -795,7 +795,6 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,

 	ret = cachefiles_bury_object(cache, NULL, dir, victim,
 				     FSCACHE_OBJECT_WAS_CULLED);
-	dput(victim);
 	if (ret < 0)
 		goto error;
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-24 22:35 [PATCH] cachefiles: Fix excess dput() after end_removing() David Howells
@ 2026-03-24 22:50 ` David Howells
2026-03-25 12:57 ` Marc Dionne
2026-03-26 10:15 ` [PATCH v2] " David Howells
1 sibling, 1 reply; 6+ messages in thread
From: David Howells @ 2026-03-24 22:50 UTC
Cc: dhowells, NeilBrown, Marc Dionne, Paulo Alcantara, Christian Brauner,
netfs, linux-afs, linux-fsdevel, linux-kernel

David Howells <dhowells@redhat.com> wrote:

> Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")

Actually, this should probably be:

Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-24 22:50 ` David Howells
@ 2026-03-25 12:57 ` Marc Dionne
2026-03-26 7:51 ` David Howells
0 siblings, 1 reply; 6+ messages in thread
From: Marc Dionne @ 2026-03-25 12:57 UTC
To: David Howells
Cc: NeilBrown, Paulo Alcantara, Christian Brauner, netfs, linux-afs,
linux-fsdevel, linux-kernel

On Tue, Mar 24, 2026 at 7:50 PM David Howells <dhowells@redhat.com> wrote:
>
> David Howells <dhowells@redhat.com> wrote:
>
> > Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
>
> Actually, this should probably be:
>
> Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")

I think it is the correct Fixes tag, but I'm not sure that this is
actually the right fix. 7bb1eb45e43c switched other callers of
cachefiles_bury_object to use start_removing_dentry, which gets an
additional ref, and removed the extra dget from
cachefiles_bury_object. In the cachefiles_cull case however, the
dentry is from start_removing and has a single ref on entry to
cachefiles_bury_object, which is an issue as "rep" may be used there
after end_removing may have put the last ref. So the correct fix is
probably for cachefiles_cull to add a dget() before the call to
cachefiles_bury_object.

Marc
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-25 12:57 ` Marc Dionne
@ 2026-03-26 7:51 ` David Howells
2026-03-26 9:07 ` NeilBrown
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2026-03-26 7:51 UTC
To: Marc Dionne
Cc: dhowells, NeilBrown, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel

Marc Dionne <marc.c.dionne@gmail.com> wrote:

> I think it is the correct Fixes tag, but I'm not sure that this is
> actually the right fix. 7bb1eb45e43c switched other callers of
> cachefiles_bury_object to use start_removing_dentry, which gets an
> additional ref, and removed the extra dget from
> cachefiles_bury_object. In the cachefiles_cull case however, the
> dentry is from start_removing and has a single ref on entry to
> cachefiles_bury_object, which is an issue as "rep" may be used there
> after end_removing may have put the last ref. So the correct fix is
> probably for cachefiles_cull to add a dget() before the call to
> cachefiles_bury_object.

Ugh. You're right.

The problem is that we're calling start_removing() without knowing whether we
can just unlink the object. I wonder if I need to do the lookup in
cachefiles_lookup_for_cull() and only then call start_removing_dentry() if
it's not a directory (directories get moved to the graveyard for cachefilesd
to tear down).

I think the right solution is actually to move start_removing_dentry() down
into cachefiles_bury_object() and make it contingent on the dentry being a
non-dir.

David
* Re: [PATCH] cachefiles: Fix excess dput() after end_removing()
2026-03-26 7:51 ` David Howells
@ 2026-03-26 9:07 ` NeilBrown
0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2026-03-26 9:07 UTC
To: David Howells
Cc: Marc Dionne, dhowells, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel

On Thu, 26 Mar 2026, David Howells wrote:
> Marc Dionne <marc.c.dionne@gmail.com> wrote:
>
> > I think it is the correct Fixes tag, but I'm not sure that this is
> > actually the right fix. 7bb1eb45e43c switched other callers of
> > cachefiles_bury_object to use start_removing_dentry, which gets an
> > additional ref, and removed the extra dget from
> > cachefiles_bury_object. In the cachefiles_cull case however, the
> > dentry is from start_removing and has a single ref on entry to
> > cachefiles_bury_object, which is an issue as "rep" may be used there
> > after end_removing may have put the last ref. So the correct fix is
> > probably for cachefiles_cull to add a dget() before the call to
> > cachefiles_bury_object.
>
> Ugh. You're right.
>
> The problem is that we're calling start_removing() without knowing whether we
> can just unlink the object. I wonder if I need to do the lookup in
> cachefiles_lookup_for_cull() and only then call start_removing_dentry() if
> it's not a directory (directories get moved to the graveyard for cachefilesd
> to tear down).
>
> I think the right solution is actually to move start_removing_dentry() down
> into cachefiles_bury_object() and make it contingent on the dentry being a
> non-dir.
>
> David
>

cachefiles_bury_object() has a comment saying:

 * On entry there must be at least 2 refs on rep, one will be dropped on exit.

and this is consistent with the code in that function.

It is called from 3 places.
- cachefiles_invalidate_cookie(), cachefiles_look_up_object(), and
  cachefiles_acquire_volume() all precede it with a
  start_removing_dentry() which results in 2 references to the dentry
  (the original and an extra which it takes) - so that fits with the
  comment.
- cachefiles_cull() precedes it with cachefiles_lookup_for_cull() which
  uses start_removing() which returns with 1 reference to the dentry.
  As the dentry didn't pre-exist, there is only one ref. So this is
  incorrect.

cachefiles_cull() needs to take an extra reference to victim so that when
cachefiles_bury_object() calls end_removing, it still has a valid
reference.

So I think

--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -781,7 +781,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	if (ret < 0)
 		goto error_unlock;

-	ret = cachefiles_bury_object(cache, NULL, dir, victim,
+	ret = cachefiles_bury_object(cache, NULL, dir, dget(victim),
 				     FSCACHE_OBJECT_WAS_CULLED);
 	dput(victim);
 	if (ret < 0)

would be a correct fix.

If you agree I can post a properly formatted patch with explanation.

Thanks,
NeilBrown
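The reference accounting argued in the message above can be modelled in userspace. This is only a sketch under stated assumptions, not kernel code: toy_dentry, toy_dget(), toy_dput(), toy_bury_dir() and the two toy_cull_*() helpers are hypothetical names, only the count is modelled, and toy_bury_dir() assumes the directory path drops one reference (as end_removing() is described as doing) and then keeps using rep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry; only the reference count is modelled. */
struct toy_dentry {
	int refs;
	bool touched_unpinned;	/* set if used while no reference is held */
};

static struct toy_dentry *toy_dget(struct toy_dentry *d)
{
	d->refs++;
	return d;
}

static void toy_dput(struct toy_dentry *d)
{
	d->refs--;
}

/* Models the directory path: one ref is dropped, then rep is used again. */
static int toy_bury_dir(struct toy_dentry *rep)
{
	toy_dput(rep);			/* stands in for end_removing(rep) */
	if (rep->refs <= 0)
		rep->touched_unpinned = true;	/* would be a use after the last put */
	return 0;
}

/* Caller holding a single reference, as described for the cull path. */
static int toy_cull_one_ref(struct toy_dentry *victim)
{
	int ret = toy_bury_dir(victim);

	toy_dput(victim);		/* pushes the count below zero */
	return ret;
}

/* Caller taking the extra reference first, as in the snippet above. */
static int toy_cull_extra_ref(struct toy_dentry *victim)
{
	int ret = toy_bury_dir(toy_dget(victim));

	toy_dput(victim);		/* balances the original reference */
	return ret;
}
```

Starting from one reference, toy_cull_one_ref() both touches the object unpinned and underflows the count, while toy_cull_extra_ref() stays pinned throughout and ends balanced at zero.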
* [PATCH v2] cachefiles: Fix excess dput() after end_removing()
2026-03-24 22:35 [PATCH] cachefiles: Fix excess dput() after end_removing() David Howells
2026-03-24 22:50 ` David Howells
@ 2026-03-26 10:15 ` David Howells
1 sibling, 0 replies; 6+ messages in thread
From: David Howells @ 2026-03-26 10:15 UTC
To: NeilBrown
Cc: dhowells, Marc Dionne, Paulo Alcantara, Christian Brauner, netfs,
linux-afs, linux-fsdevel, linux-kernel

When cachefiles_cull() calls cachefiles_bury_object(), the latter eats the
former's ref on the victim dentry that it obtained from
cachefiles_lookup_for_cull(). However, commit 7bb1eb45e43c left the dput
of the victim in place, resulting in occasional:

	WARNING: fs/dcache.c:829 at dput.part.0+0xf5/0x110, CPU#7: cachefilesd/11831
	cachefiles_cull+0x8c/0xe0 [cachefiles]
	cachefiles_daemon_cull+0xcd/0x120 [cachefiles]
	cachefiles_daemon_write+0x14e/0x1d0 [cachefiles]
	vfs_write+0xc3/0x480
	...

reports.

Actually, it's worse than that: cachefiles_bury_object() eats the ref it was
given - and then may continue to access the now-unref'd dentry if it turns
out to be a directory. So simply removing the aberrant dput() is not
sufficient.

Fix this by making cachefiles_bury_object() retain the ref itself around
end_removing() if it needs to keep it and then drop the ref before
returning.

Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: NeilBrown <neil@brown.name>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 36 +++++++++++++++++++++---------------
1 file changed, 21 insertions(+), 15 deletions(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index e5ec90dccc27..20138309733f 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -287,14 +287,14 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 	if (!d_is_dir(rep)) {
 		ret = cachefiles_unlink(cache, object, dir, rep, why);
 		end_removing(rep);
-
 		_leave(" = %d", ret);
 		return ret;
 	}

 	/* directories have to be moved to the graveyard */
 	_debug("move stale object to graveyard");
-	end_removing(rep);
+	dget(rep);
+	end_removing(rep); /* Drops ref on rep */

 try_again:
 	/* first step is to make up a grave dentry in the graveyard */
@@ -304,8 +304,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,

 	/* do the multiway lock magic */
 	trap = lock_rename(cache->graveyard, dir);
-	if (IS_ERR(trap))
-		return PTR_ERR(trap);
+	if (IS_ERR(trap)) {
+		ret = PTR_ERR(trap);
+		goto out;
+	}

 	/* do some checks before getting the grave dentry */
 	if (rep->d_parent != dir || IS_DEADDIR(d_inode(rep))) {
@@ -313,25 +315,27 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		 * lock */
 		unlock_rename(cache->graveyard, dir);
 		_leave(" = 0 [culled?]");
-		return 0;
+		ret = 0;
+		goto out;
 	}

+	ret = -EIO;
 	if (!d_can_lookup(cache->graveyard)) {
 		unlock_rename(cache->graveyard, dir);
 		cachefiles_io_error(cache, "Graveyard no longer a directory");
-		return -EIO;
+		goto out;
 	}

 	if (trap == rep) {
 		unlock_rename(cache->graveyard, dir);
 		cachefiles_io_error(cache, "May not make directory loop");
-		return -EIO;
+		goto out;
 	}

 	if (d_mountpoint(rep)) {
 		unlock_rename(cache->graveyard, dir);
 		cachefiles_io_error(cache, "Mountpoint in cache");
-		return -EIO;
+		goto out;
 	}

 	grave = lookup_one(&nop_mnt_idmap, &QSTR(nbuffer), cache->graveyard);
@@ -343,11 +347,12 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,

 		if (PTR_ERR(grave) == -ENOMEM) {
 			_leave(" = -ENOMEM");
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto out;
 		}

 		cachefiles_io_error(cache, "Lookup error %ld", PTR_ERR(grave));
-		return -EIO;
+		goto out;
 	}

 	if (d_is_positive(grave)) {
@@ -362,7 +367,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		unlock_rename(cache->graveyard, dir);
 		dput(grave);
 		cachefiles_io_error(cache, "Mountpoint in graveyard");
-		return -EIO;
+		goto out;
 	}

 	/* target should not be an ancestor of source */
@@ -370,7 +375,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		unlock_rename(cache->graveyard, dir);
 		dput(grave);
 		cachefiles_io_error(cache, "May not make directory loop");
-		return -EIO;
+		goto out;
 	}

 	/* attempt the rename */
@@ -404,8 +409,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 	__cachefiles_unmark_inode_in_use(object, d_inode(rep));
 	unlock_rename(cache->graveyard, dir);
 	dput(grave);
-	_leave(" = 0");
-	return 0;
+	_leave(" = %d", ret);
+out:
+	dput(rep);
+	return ret;
 }

 /*
@@ -812,7 +819,6 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,

 	ret = cachefiles_bury_object(cache, NULL, dir, victim,
 				     FSCACHE_OBJECT_WAS_CULLED);
-	dput(victim);
 	if (ret < 0)
 		goto error;
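The shape of the v2 change (pin the dentry before end_removing(), then funnel every later exit through one label that drops the pin) can be sketched in userspace as follows. This is a toy model under stated assumptions, not the kernel implementation: toy_dentry, toy_dget(), toy_dput(), toy_end_removing(), toy_bury() and the fail_step parameter are all hypothetical, and the failure points merely stand in for the error paths in the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a dentry; only the count and type are modelled. */
struct toy_dentry {
	int refs;
	int is_dir;
};

static void toy_dget(struct toy_dentry *d) { d->refs++; }
static void toy_dput(struct toy_dentry *d) { d->refs--; }

/* Assumption: like end_removing() in the patch comment, drops one ref. */
static void toy_end_removing(struct toy_dentry *d) { toy_dput(d); }

/*
 * fail_step is a made-up knob: 0 means success, 1 simulates an -ENOMEM
 * failure, 2 simulates one of the -EIO failures.
 */
static int toy_bury(struct toy_dentry *rep, int fail_step)
{
	int ret;

	if (!rep->is_dir) {
		/* Non-directories: nothing touches rep after the ref is dropped. */
		toy_end_removing(rep);
		return 0;
	}

	toy_dget(rep);		/* keep rep pinned across end_removing() */
	toy_end_removing(rep);
	assert(rep->refs > 0);	/* still safe to use on the directory path */

	if (fail_step == 1) {
		ret = -ENOMEM;
		goto out;
	}

	ret = -EIO;		/* default for the remaining failure checks */
	if (fail_step == 2)
		goto out;

	ret = 0;
out:
	toy_dput(rep);		/* single place where the pin is released */
	return ret;
}
```

In this model every path consumes exactly the one reference the caller passed in, so a caller such as the cull path needs no dput() of its own afterwards.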