* [GIT PULL] fuse update for 6.19
@ 2025-12-04 8:25 Miklos Szeredi
2025-12-05 23:47 ` Linus Torvalds
2025-12-06 0:17 ` pr-tracker-bot
0 siblings, 2 replies; 14+ messages in thread
From: Miklos Szeredi @ 2025-12-04 8:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-fsdevel, linux-kernel
Hi Linus,
Please pull from:
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git
tags/fuse-update-6.19
- Add mechanism for cleaning out unused, stale dentries; controlled
via a module option (Luis Henriques)
- Fix various bugs
- Cleanups
The stale dentry cleanup has a patch touching dcache.c: this extracts
a helper from d_prune_aliases() that puts the unused dentry on a
dispose list. Export this and shrink_dentry_list() to modules.
Thanks,
Miklos
---
Bernd Schubert (3):
fuse: Fix whitespace for fuse_uring_args_to_ring() comment
fuse: Invalidate the page cache after FOPEN_DIRECT_IO write
fuse: Always flush the page cache before FOPEN_DIRECT_IO write
Cheng Ding (1):
fuse: missing copy_finish in fuse-over-io-uring argument copies
Dan Carpenter (1):
fuse: Uninitialized variable in fuse_epoch_work()
Darrick J. Wong (1):
fuse: signal that a fuse inode should exhibit local fs behaviors
Joanne Koong (2):
fuse: fix readahead reclaim deadlock
fuse: fix io-uring list corruption for terminated non-committed requests
Luis Henriques (4):
dcache: export shrink_dentry_list() and add new helper d_dispose_if_unused()
fuse: new work queue to periodically invalidate expired dentries
fuse: new work queue to invalidate dentries from old epochs
fuse: refactor fuse_conn_put() to remove negative logic.
Miklos Szeredi (1):
fuse: add WARN_ON and comment for RCU revalidate
Miquel Sabaté Solà (2):
fuse: use strscpy instead of strcpy
fuse: rename 'namelen' to 'namesize'
---
fs/dcache.c | 18 ++--
fs/fuse/dev.c | 9 +-
fs/fuse/dev_uring.c | 12 ++-
fs/fuse/dir.c | 248 +++++++++++++++++++++++++++++++++++++++++++------
fs/fuse/file.c | 37 ++++++--
fs/fuse/fuse_dev_i.h | 1 +
fs/fuse/fuse_i.h | 28 +++++-
fs/fuse/inode.c | 44 +++++----
fs/overlayfs/super.c | 12 ++-
include/linux/dcache.h | 2 +
10 files changed, 340 insertions(+), 71 deletions(-)
* Re: [GIT PULL] fuse update for 6.19
From: Linus Torvalds @ 2025-12-05 23:47 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

On Thu, 4 Dec 2025 at 00:25, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> The stale dentry cleanup has a patch touching dcache.c: this extracts
> a helper from d_prune_aliases() that puts the unused dentry on a
> dispose list.  Export this and shrink_dentry_list() to modules.

Is that

	spin_lock(&dentry->d_lock);
	if (!dentry->d_lockref.count)
		to_shrink_list(dentry, dispose);
	spin_unlock(&dentry->d_lock);

thing possibly hot, and count might be commonly non-zero?

Because it's possible that we could just make it a lockref operation
where we atomically don't take the lock if the count is non-zero so
that we don't unnecessarily move cachelines around...

IOW, some kind of "lockref_lock_if_zero()" pattern?

I have no idea what the fuse dentry lifetime patterns might be, maybe
this is a complete non-issue...

            Linus
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2025-12-06 1:42 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Fri, Dec 05, 2025 at 03:47:50PM -0800, Linus Torvalds wrote:
> On Thu, 4 Dec 2025 at 00:25, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > The stale dentry cleanup has a patch touching dcache.c: this extracts
> > a helper from d_prune_aliases() that puts the unused dentry on a
> > dispose list.  Export this and shrink_dentry_list() to modules.
>
> Is that
>
> 	spin_lock(&dentry->d_lock);
> 	if (!dentry->d_lockref.count)
> 		to_shrink_list(dentry, dispose);
> 	spin_unlock(&dentry->d_lock);
>
> thing possibly hot, and count might be commonly non-zero?
>
> Because it's possible that we could just make it a lockref operation
> where we atomically don't take the lock if the count is non-zero so
> that we don't unnecessarily move cachelines around...
>
> IOW, some kind of "lockref_lock_if_zero()" pattern?
>
> I have no idea what the fuse dentry lifetime patterns might be, maybe
> this is a complete non-issue...

Far more interesting question, IMO, is what's to prevent memory
pressure from evicting the damn argument right under us.

AFAICS, fuse_dentry_tree_work() calls that thing with no locks held.
The one and only reason why that's OK in d_prune_aliases() is ->i_lock
held over that thing - that's enough to prevent eviction.  I don't see
anything to serve the same purpose here.
* Re: [GIT PULL] fuse update for 6.19
From: Linus Torvalds @ 2025-12-06 1:52 UTC (permalink / raw)
To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Fri, 5 Dec 2025 at 17:42, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Far more interesting question, IMO, is what's to prevent memory
> pressure from evicting the damn argument right under us.

That was my first reaction, but look at the 'fuse_dentry_prune()' logic.

So if the dentry is removed by the VFS layer, it should be removed here
too.

But maybe I missed something,

            Linus
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2025-12-06 2:28 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Fri, Dec 05, 2025 at 05:52:51PM -0800, Linus Torvalds wrote:
> On Fri, 5 Dec 2025 at 17:42, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Far more interesting question, IMO, is what's to prevent memory
> > pressure from evicting the damn argument right under us.
>
> That was my first reaction, but look at the 'fuse_dentry_prune()' logic.
>
> So if the dentry is removed by the VFS layer, it should be removed here too.

Sure, ->d_prune() would take it out of the rbtree, but what if it hits

		rb_erase(&fd->node, &dentry_hash[i].tree);
		RB_CLEAR_NODE(&fd->node);
		spin_unlock(&dentry_hash[i].lock);

... right here, when we are not holding any locks anymore?

		d_dispose_if_unused(fd->dentry, &dispose);
		cond_resched();
		spin_lock(&dentry_hash[i].lock);
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2025-12-06 3:10 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Sat, Dec 06, 2025 at 02:28:26AM +0000, Al Viro wrote:
> Sure, ->d_prune() would take it out of the rbtree, but what if it hits
>
> 		rb_erase(&fd->node, &dentry_hash[i].tree);
> 		RB_CLEAR_NODE(&fd->node);
> 		spin_unlock(&dentry_hash[i].lock);
>
> ... right here, when we are not holding any locks anymore?
>
> 		d_dispose_if_unused(fd->dentry, &dispose);
> 		cond_resched();
> 		spin_lock(&dentry_hash[i].lock);

... and with what fuse_dentry_prune() is doing, we can't grab ->d_lock
or bump ->d_count before dropping dentry_hash[...].lock.  ->d_release()
is the one called outside of ->d_lock; ->d_prune() is under it, so we'd
get an AB-BA deadlock if we tried to do that kind of stuff.

Moving the eviction to ->d_release() might be doable; then we'd have
fuse locks outside of ->d_lock and could call that thing under those.

I'll need to poke around some more, but TBH I don't like that primitive -
it's really easy to fuck up and conditions for its safe use are, AFAICS,
never spelled out.
* Re: [GIT PULL] fuse update for 6.19
From: Linus Torvalds @ 2025-12-06 3:29 UTC (permalink / raw)
To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Fri, 5 Dec 2025 at 18:28, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Sure, ->d_prune() would take it out of the rbtree, but what if it hits

Ahh.

Maybe increase the d_count before releasing that rbtree lock?

Or yeah, maybe moving it to d_release.  Miklos?

            Linus
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2025-12-06 3:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Fri, Dec 05, 2025 at 07:29:13PM -0800, Linus Torvalds wrote:
> On Fri, 5 Dec 2025 at 18:28, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Sure, ->d_prune() would take it out of the rbtree, but what if it hits
>
> Ahh.
>
> Maybe increase the d_count before releasing that rbtree lock?
>
> Or yeah, maybe moving it to d_release.  Miklos?

Moving it to ->d_release() would be my preference, TBH.  Then we could
simply dget() the sucker under the lock and follow that with the
existing dput_to_list() after dropping the lock...
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2025-12-06 4:22 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Miklos Szeredi, linux-fsdevel, linux-kernel

On Sat, Dec 06, 2025 at 03:54:03AM +0000, Al Viro wrote:
> Moving it to ->d_release() would be my preference, TBH.  Then
> we could simply dget() the sucker under the lock and follow
> that with existing dput_to_list() after dropping the lock...

s/dget/grab ->d_lock, increment ->d_count if not negative,
drop ->d_lock/ - we need to deal with the possibility of
the victim just going into __dentry_kill() as we find it.

And yes, it would be better off with something like

	lockref_get_if_zero(struct lockref *lockref)
	{
		bool retval = false;
		CMPXCHG_LOOP(
			new.count++;
			if (old_count != 0)
				return false;
		,
			return true;
		);
		spin_lock(&lockref->lock);
		if (lockref->count == 0) {
			lockref->count = 1;
			retval = true;
		}
		spin_unlock(&lockref->lock);
		return retval;
	}

with

	while (node) {
		fd = rb_entry(node, struct fuse_dentry, node);
		if (!time_after64(get_jiffies_64(), fd->time))
			break;
		rb_erase(&fd->node, &dentry_hash[i].tree);
		RB_CLEAR_NODE(&fd->node);
		if (lockref_get_if_zero(&dentry->d_lockref))
			dput_to_list(dentry);
		if (need_resched()) {
			spin_unlock(&dentry_hash[i].lock);
			schedule();
			spin_lock(&dentry_hash[i].lock);
		}
		node = rb_first(&dentry_hash[i].tree);
	}

in that loop.  Actually... a couple of questions:

* why do we call shrink_dentry_list() separately for each hash bucket?
Easier to gather everything and call it once...

* what's the point of rbtree there?  What's wrong with plain hlist?

Folks?
* Re: [GIT PULL] fuse update for 6.19
From: Miklos Szeredi @ 2025-12-08 10:37 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

On Sat, 6 Dec 2025 at 05:22, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> s/dget/grab ->d_lock, increment ->d_count if not negative,
> drop ->d_lock/ - we need to deal with the possibility of
> the victim just going into __dentry_kill() as we find it.
>
> And yes, it would be better off with something like
>
> [...]
>
> in that loop.  Actually... a couple of questions:

Looks good.  Do you want me to submit a proper patch?

> * why do we call shrink_dentry_list() separately for each hash
> bucket?  Easier to gather everything and call it once...

No good reason.

> * what's the point of rbtree there?  What's wrong with plain
> hlist?  Folks?

The list needs to be ordered wrt. end of validity time.  The timeout
can be different from one dentry to another even within a fuse fs, but
more likely to be varied between different fuse filesystems, so
insertion time itself doesn't determine the validity end time.

Thanks,
Miklos
* Re: [GIT PULL] fuse update for 6.19
From: Miklos Szeredi @ 2026-01-14 15:23 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

On Sat, 6 Dec 2025 at 05:22, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> s/dget/grab ->d_lock, increment ->d_count if not negative,
> drop ->d_lock/ - we need to deal with the possibility of
> the victim just going into __dentry_kill() as we find it.
>
> [...]

Posted a short patchset fixing this.

I really think there's no point in doing the get-ref, drop-ref dance.
Retrieving the dentry from any source will require locking and that
needs to nest d_lock with or without the refcount manipulation.

So I kept the d_dispose_if_unused() API, but added a note to the
function doc that additional locking is necessary to prevent eviction.

Thanks,
Miklos
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2026-04-12 6:16 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

On Wed, Jan 14, 2026 at 04:23:04PM +0100, Miklos Szeredi wrote:
> Posted a short patchset fixing this.
>
> I really think there's no point in doing the get-ref, drop-ref dance.
> Retrieving the dentry from any source will require locking and that
> needs to nest d_lock with or without the refcount manipulation.
>
> So I kept the d_dispose_if_unused() API, but added a note to the
> function doc that additional locking is necessary to prevent eviction.

Looking at that thing again, I really hate how subtle it is ;-/

Consider the lockless case in your ->d_release() and try to write down
the reasons why it's safe.  Among the other things, kfree_rcu() is there
to protect RCU callers of ->d_revalidate(); fair enough, but it quietly
doubles as delaying that freeing past the scope of dentry_hash[].lock in
which you'd done RB_CLEAR_NODE(&fd->node) that had pushed ->d_release()
to the lockless path.

Another side of that fun is the proof that if fuse_dentry_tree_work()
sees a dentry, it won't get freed under us.  The locked case of
->d_release() is easy; proving that the lockless one is OK without
explicit barriers is more interesting.  Basically, all insertions are
ordered wrt ->d_release() (on ->d_lock), so if the value we are
observing in RB_EMPTY_NODE() has come from those, we will hit the
locked case.  If the value we observe has come from RB_CLEAR_NODE() in
an earlier fuse_dentry_tree_work() (these are ordered on
dentry_hash[].lock, wrt each other and insertions), there mustn't have
been any subsequent insertions, or ->d_release() would've observed the
effect of those.  If the value has come from the _same_
fuse_dentry_tree_work(), the implicit barrier in spin_lock() would've
ordered that store wrt the beginning of the scope, and since
dentry_free() is ordered wrt ->d_release(), the callback of call_rcu()
in there wouldn't run until the end of the scope in question.

So I think it's OK, but having it done without a single comment either
on barriers or on memory safety...  Ouch.

Note that we *can* run into a dentry getting killed just after
RB_CLEAR_NODE() in there; it's just that a store to ->d_flags of a
dying or killed dentry is safe under ->d_lock, and
d_dispose_if_unused() is a no-op for dying and killed ones - it's not
just "If dentry has no external references, move it to shrink list" as
your comment in fs/dcache.c says.  And in your case it very much does
have an external reference - it's just that it's an equivalent of an
RCU one, with all the joy that inflicts upon the user.

FWIW, right now I'm still looking for the source of the UAF reported
by jlayton; I think this stuff in fuse can be excluded as a possible
cause, but I'm not happy with that primitive at all... ;-/
* Re: [GIT PULL] fuse update for 6.19
From: Al Viro @ 2026-04-13 0:29 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

On Sun, Apr 12, 2026 at 07:16:45AM +0100, Al Viro wrote:
> Looking at that thing again, I really hate how subtle it is ;-/
>
> [...]

FWIW, here's what I've ended up with yesterday:

	/*
	 * ->d_release() is ordered (on ->d_lock) after everything done
	 * while holding counting references, including all calls of
	 * fuse_dentry_add_tree_node().  Freeing a dentry is RCU-delayed
	 * and scheduling it is ordered after ->d_release().
	 *
	 * In ->d_release() we see the initialization and all stores done by
	 * fuse_dentry_add_tree_node(); since fuse_dentry_tree_work() is
	 * ordered wrt fuse_dentry_add_tree_node() (on dentry_hash[].lock)
	 * we also see all stores from fuse_dentry_tree_work() except possibly
	 * the last one.
	 *
	 * If the value fuse_dentry_release() observes in
	 *	if (!RB_EMPTY_NODE(&fd->node))
	 * has come from initialization, the node had never been inserted
	 * into the rbtree and fuse_dentry_tree_work() won't see it at all.
	 *
	 * If the value we observe there comes from the last store in
	 * fuse_dentry_add_tree_node(), RB_EMPTY_NODE(&fd->node) will be
	 * false and we are going to enter a dentry_hash[].lock scope,
	 * providing an ordering wrt fuse_dentry_tree_work(), which either
	 * won't find fd in the rbtree, or will complete before we even
	 * get to scheduling dentry freeing.
	 *
	 * If the value we observe comes from fuse_dentry_tree_work(), we will
	 * be ordered past the beginning of the dentry_hash[].lock scope that
	 * store had happened in.  Since the call_rcu() that schedules freeing
	 * the dentry is ordered past the return from ->d_release(), freeing
	 * won't happen until the end of that scope.  Since there can be no
	 * subsequent calls of fuse_dentry_add_tree_node(), any subsequent
	 * calls of fuse_dentry_tree_work() won't see the node at all.
	 *
	 * Either way, if fuse_dentry_tree_work() sees a node, we are
	 * guaranteed that node->dentry won't get freed under it.
	 *
	 * Freeing the node itself is also RCU-delayed and scheduled at the
	 * very end of the FUSE ->d_release(); similar considerations show
	 * that if fuse_dentry_tree_work() sees a node, we are guaranteed
	 * that the node won't get freed under us either.
	 *
	 * While it is possible to run into a dying (or killed) dentry in
	 * fuse_dentry_tree_work(), d_dispose_if_unused() is safe to be
	 * used for those - it's a no-op in such case.
	 */

Is the above sane?  If it is, something similar would be worth having
written down somewhere in fs/fuse - it's not trivial and it's not hard
to break accidentally...

As for the primitive itself...

	/**
	 * __move_to_shrink_list - try to place a dentry into a shrink list
	 * @dentry: dentry to try putting into shrink list
	 * @list: the list to put @dentry into.
	 * Returns: true if @dentry has been placed into @list, false otherwise
	 *
	 * If @dentry is idle and not already included in a shrink list, move
	 * it into @list and return %true; otherwise do nothing and return
	 * %false.
	 *
	 * Caller must be holding @dentry->d_lock.  There must have been no
	 * calls of dentry_free(@dentry) prior to the beginning of the RCU
	 * read-side critical area in which __move_to_shrink_list(@dentry,
	 * @list) is called.
	 *
	 * @list should be thread-private and eventually emptied by passing
	 * it to shrink_dentry_list().
	 */
	bool __move_to_shrink_list(struct dentry *dentry, struct list_head *list)
		__must_hold(&dentry->d_lock)
	{
		if (likely(!dentry->d_lockref.count &&
			   !(dentry->d_flags & DCACHE_SHRINK_LIST))) {
			if (dentry->d_flags & DCACHE_LRU_LIST)
				d_lru_del(dentry);
			d_shrink_add(dentry, list);
			return true;
		}
		return false;
	}
	EXPORT_SYMBOL(__move_to_shrink_list);

With something like

	spin_lock(&fd->dentry->d_lock);
	/* If dentry is still referenced, let next dput release it */
	fd->dentry->d_flags |= DCACHE_OP_DELETE;
	__move_to_shrink_list(fd->dentry, &dispose);
	spin_unlock(&fd->dentry->d_lock);

in fuse_dentry_tree_work() (along with the explanations of memory
safety, that is).

Comments?
* Re: [GIT PULL] fuse update for 6.19
From: pr-tracker-bot @ 2025-12-06 0:17 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

The pull request you sent on Thu, 4 Dec 2025 09:25:29 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git tags/fuse-update-6.19

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/4b6b4321280ea1ea1e101fd39d8664195d18ecb0

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html