public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] fuse update for 6.19
Date: Mon, 13 Apr 2026 23:04:33 +0100	[thread overview]
Message-ID: <20260413220433.GD3836593@ZenIV> (raw)
In-Reply-To: <20260413002940.GA2846532@ZenIV>

On Mon, Apr 13, 2026 at 01:29:41AM +0100, Al Viro wrote:

> As for the primitive itself...
> 
> /**
>  * __move_to_shrink_list - try to place a dentry into a shrink list
>  * @dentry:	dentry to try putting into shrink list
>  * @list:	the list to put @dentry into.
>  * Returns:	true @dentry had been placed into @list, false otherwise
>  *
>  * If @dentry is idle and not already include into a shrink list, move
>  * it into @list and return %true; otherwise do nothing and return %false.
>  *
>  * Caller must be holding @dentry->d_lock.  There must have been no calls of
>  * dentry_free(@dentry) prior to the beginning of the RCU read-side critical
>  * area in which __move_to_shrink_list(@dentry, @list) is called.
>  *
>  * @list should be thread-private and eventually emptied by passing it to
>  * shrink_dentry_list().
>  */
> bool __move_to_shrink_list(struct dentry *dentry, struct list_head *list)

Speaking of safety, it's probably worth adding
struct shrink_list {
	struct list_head __set;
};
and converting the lists used for that to that type.  That would affect

1) function arguments that are borrowed pointers to instances:
	* shrink_dentry_list() argument
	* __move_to_shrink_list() second argument
	* d_shrink_add() second argument
	* __umount_mnt() second argument
	* maybe_free_mountpoint() second argument
	* dput_to_list() second argument
	* d_lru_shrink_move() second argument
	* dentry_list_isolate() third argument/freeable (disguised closure,
          callback + argument passed via list_lru_shrink_walk())
	* dentry_list_isolate_shrink() third argument/freeable (ditto)

2) one global instance:
	* ex_mountpoints; implicitly acquired in the beginning of exclusive
scope of namespace_sem.  Contents moved into auto namespace_unlock():list
a bit before dropping namespace_sem or downgrading it to shared with.
No accesses to emptied object between the move and unlock/downgrade (the
only thing done in that range is checking if notify_list is empty in order
to decide whether to unlock or to downgrade to shared).  All accesses are
done by holders of namespace_sem (exclusive).  Verifying is...  interesting.

3) instances in auto variables - all start empty, all are emptied by
shrink_dentry_list() at the end of scope or nearby:
	* d_prune_aliases():dispose
	* prune_dcache_sb():dispose
	* shrink_dcache_sb():dispose
	* fuse_dentry_tree_work():dispose
	* mntput_no_expire_slowpath():list
	* namespace_unlock():list

On top of that, there's struct collect_data::dispose; the only instance
of struct collect_data is in shrink_dcache_tree(), passed (via a poor man's
closure) to select_collect()/select_umount()/select_collect2().
Emptying the instance is... interesting.  Normally that happens at the
end of scope; however, in case of select_collect2() stopping at the
dying dentry we ask to be notified when the victim stops twitching,
evict whatever we'd collected prior to that point, emptying the shrink list,
then wait for completion.  Inverting the order would lead to pointless
waiting, so shrink_dentry_list() here (and only here, AFAICS) really
shouldn't be reordered to the end of scope.

Operations:
	extraction of the first element (in shrink_dentry_list()).
	(implicitly) removal of element in d_shrink_del().
	element insertion in d_shrink_add().
	list fed to list_lru_isolate_move() as the third argument,
with element insertion done there.
	entire contents spliced to another set (in namespace_unlock()).
	emptiness checks (in shrink_dcache_tree(), some of those redundant,
also in select_collect(), select_collect2() and shrink_dentry_list()).

All of that does look like a good cause for an explicit type.  And
it looks like cleanup.h infrastructure might be of some use for
those, to make it easier to verify that we don't forget to empty
the suckers at the end...

PS: analysis of ex_mountpoints accesses:
	Users: unpin_mountpoint(), umount_mnt(), mnt_change_mountpoint(),
namespace_unlock().  Call trees:
	unpin_mountpoint()
	    __detach_mounts(), in scope of namespace_excl
	    attach_recursive_mnt()
	        graft_tree()
	            do_loopback(), in scope of LOCK_MOUNT
	            do_add_mount()
	                do_new_mount_fc(), in scope of LOCK_MOUNT
	                finish_automount(), in scope of LOCK_MOUNT_EXACT
	        do_move_mount(), in scope of LOCK_MOUNT_MAYBE_BENEATH
	    __unlock_mount()
	        unlock_mount()
	            termination of scope of LOCK_MOUNT{,_EXACT,_EXACT_COPY}
	        lock_mount_exact()
	            failure exit past grabbing namespace_sem exclusive
	umount_mnt()
	    umount_tree()
	        do_umount(), in scope of namespace_excl
	        __detach_mounts(), in scope of namespace_excl
	        copy_tree()
		    propagate_mnt()
			attach_recursive_mnt(), see above
	            __do_loopback()
	                do_loopback(), in scope of LOCK_MOUNT
	                get_detached_copy(), in scope of namespace_excl
	                create_new_namespace(), in scope of LOCK_MOUNT_EXACT_COPY
	            copy_mnt_ns(), in scope of namespace_excl
	        dissolve_on_fput(), in scope of namespace_excl
	        attach_recursive_mnt(), see above
	    __detach_mounts(), in scope of namespace_excl
	    attach_recursive_mnt(), see above
	    path_pivot_root(), in scope of LOCK_MOUNT
	mnt_change_mountpoint()
	    attach_recursive_mnt(), see above
	    reparent()
		propagate_umount()
		    umount_tree(), see above
	namespace_unlock()
	    termination of namespace_excl scope, some of those written
	    weirdly for various reasons.
Entering a LOCK_MOUNT... scope opens a namespace_excl one; leaving -
closes it.
	Overall, moderately unpleasant to verify manually; verifying on
compiler level...  Ouch.  Sure, we could invent an empty type for that
with namespace_excl scope declaring an instance of that type and then
pass it as a token around, but call chains are deep enough to make that
rather clumsy.  Some of that might be helped by the fact that instances
of struct pinned_mountpoint could be treated as bearing such token -
that reduces the call tree quite a bit (e.g. unpin_mountpoint() part
falls out, so does attach_recursive_mnt(), but e.g. mnt_change_mountpoint()
part still includes a bunch of chains like
	mnt_change_mountpoint() called by
	reparent(), which is called by
	propagate_umount(), which is called by
	umount_tree(), which is called by
	copy_tree(), which is called by
	propagate_mnt(), which is called by
	attach_recursive_mnt(), which does
have pinned_mountpoint.  Nothing deeper in that call chain has any use
of pinned_mountpoint in question, so we'd need to propagate the token
through all of that.  Unpleasant...  In any case, that's not going to
happen until the oddball namespace_excl scopes are untangled.

  parent reply	other threads:[~2026-04-13 22:00 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-04  8:25 [GIT PULL] fuse update for 6.19 Miklos Szeredi
2025-12-05 23:47 ` Linus Torvalds
2025-12-06  1:42   ` Al Viro
2025-12-06  1:52     ` Linus Torvalds
2025-12-06  2:28       ` Al Viro
2025-12-06  3:10         ` Al Viro
2025-12-06  3:29         ` Linus Torvalds
2025-12-06  3:54           ` Al Viro
2025-12-06  4:22             ` Al Viro
2025-12-08 10:37               ` Miklos Szeredi
2026-01-14 15:23               ` Miklos Szeredi
2026-04-12  6:16                 ` Al Viro
2026-04-13  0:29                   ` Al Viro
2026-04-13 15:32                     ` Miklos Szeredi
2026-04-13 22:04                     ` Al Viro [this message]
2025-12-06  0:17 ` pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260413220433.GD3836593@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox