* [RFC][PATCH] saner calling conventions for ->d_automount()
@ 2025-04-24 6:08 Al Viro
2025-04-24 8:38 ` Christian Brauner
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Al Viro @ 2025-04-24 6:08 UTC (permalink / raw)
To: linux-fsdevel
Cc: Christian Brauner, David Howells, linux-nfs, Steven Rostedt,
Miklos Szeredi, linux-cifs
Currently the calling conventions for ->d_automount() instances have
an odd wart - returned new mount to be attached is expected to have
refcount 2.
That kludge is intended to make sure that mark_mounts_for_expiry() called
before we get around to attaching that new mount to the tree won't decide
to take it out. finish_automount() drops the extra reference after it's
done with attaching mount to the tree - or drops the reference twice in
case of error. ->d_automount() instances have rather counterintuitive
boilerplate in them.
There's a much simpler approach: have mark_mounts_for_expiry() skip the
mounts that are yet to be mounted. And to hell with grabbing/dropping
those extra references. Makes for simpler correctness analysis, at that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 767b2927c762..749637231773 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1203,3 +1203,10 @@ should use d_drop();d_splice_alias() and return the result of the latter.
If a positive dentry cannot be returned for some reason, in-kernel
clients such as cachefiles, nfsd, smb/server may not perform ideally but
will fail-safe.
+
+---
+
+**mandatory**
+
+Calling conventions for ->d_automount() have changed; we should *not* grab
+an extra reference to new mount - it should be returned with refcount 1.
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index ae79c30b6c0c..cc0a58e96770 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -1411,9 +1411,7 @@ defined:
If a vfsmount is returned, the caller will attempt to mount it
on the mountpoint and will remove the vfsmount from its
- expiration list in the case of failure. The vfsmount should be
- returned with 2 refs on it to prevent automatic expiration - the
- caller will clean up the additional ref.
+ expiration list in the case of failure.
This function is only used if DCACHE_NEED_AUTOMOUNT is set on
the dentry. This is set by __d_instantiate() if S_AUTOMOUNT is
diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index 45cee6534122..9434a5399f2b 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -189,7 +189,6 @@ struct vfsmount *afs_d_automount(struct path *path)
if (IS_ERR(newmnt))
return newmnt;
- mntget(newmnt); /* prevent immediate expiration */
mnt_set_expiry(newmnt, &afs_vfsmounts);
queue_delayed_work(afs_wq, &afs_mntpt_expiry_timer,
afs_mntpt_expiry_timeout * HZ);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 83ac192e7fdd..05d8584fd3b9 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -319,9 +319,6 @@ static struct vfsmount *fuse_dentry_automount(struct path *path)
/* Create the submount */
mnt = fc_mount(fsc);
- if (!IS_ERR(mnt))
- mntget(mnt);
-
put_fs_context(fsc);
return mnt;
}
diff --git a/fs/namespace.c b/fs/namespace.c
index bbda516444ff..1807ccb1a52d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3903,10 +3903,6 @@ int finish_automount(struct vfsmount *m, const struct path *path)
return PTR_ERR(m);
mnt = real_mount(m);
- /* The new mount record should have at least 2 refs to prevent it being
- * expired before we get a chance to add it
- */
- BUG_ON(mnt_get_count(mnt) < 2);
if (m->mnt_sb == path->mnt->mnt_sb &&
m->mnt_root == dentry) {
@@ -3939,7 +3935,6 @@ int finish_automount(struct vfsmount *m, const struct path *path)
unlock_mount(mp);
if (unlikely(err))
goto discard;
- mntput(m);
return 0;
discard_locked:
@@ -3953,7 +3948,6 @@ int finish_automount(struct vfsmount *m, const struct path *path)
namespace_unlock();
}
mntput(m);
- mntput(m);
return err;
}
@@ -3990,11 +3984,14 @@ void mark_mounts_for_expiry(struct list_head *mounts)
/* extract from the expiration list every vfsmount that matches the
* following criteria:
+ * - already mounted
* - only referenced by its parent vfsmount
* - still marked for expiry (marked on the last call here; marks are
* cleared by mntput())
*/
list_for_each_entry_safe(mnt, next, mounts, mnt_expire) {
+ if (!is_mounted(&mnt->mnt))
+ continue;
if (!xchg(&mnt->mnt_expiry_mark, 1) ||
propagate_mount_busy(mnt, 1))
continue;
diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index 973aed9cc5fe..7f1ec9c67ff2 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -195,7 +195,6 @@ struct vfsmount *nfs_d_automount(struct path *path)
if (IS_ERR(mnt))
goto out_fc;
- mntget(mnt); /* prevent immediate expiration */
if (timeout <= 0)
goto out_fc;
diff --git a/fs/smb/client/namespace.c b/fs/smb/client/namespace.c
index e3f9213131c4..778daf11f1db 100644
--- a/fs/smb/client/namespace.c
+++ b/fs/smb/client/namespace.c
@@ -283,7 +283,6 @@ struct vfsmount *cifs_d_automount(struct path *path)
return newmnt;
}
- mntget(newmnt); /* prevent immediate expiration */
mnt_set_expiry(newmnt, &cifs_automount_list);
schedule_delayed_work(&cifs_automount_task,
cifs_mountpoint_expiry_timeout);
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8ddf6b17215c..fa488721019f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -10085,8 +10085,6 @@ static struct vfsmount *trace_automount(struct dentry *mntpt, void *ingore)
put_filesystem(type);
if (IS_ERR(mnt))
return NULL;
- mntget(mnt);
-
return mnt;
}
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC][PATCH] saner calling conventions for ->d_automount()
2025-04-24 6:08 [RFC][PATCH] saner calling conventions for ->d_automount() Al Viro
@ 2025-04-24 8:38 ` Christian Brauner
2025-04-24 11:48 ` Jeff Layton
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2025-04-24 8:38 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, David Howells, linux-nfs, Steven Rostedt,
Miklos Szeredi, linux-cifs
On Thu, Apr 24, 2025 at 07:08:45AM +0100, Al Viro wrote:
> Currently the calling conventions for ->d_automount() instances have
> an odd wart - returned new mount to be attached is expected to have
> refcount 2.
>
> That kludge is intended to make sure that mark_mounts_for_expiry() called
> before we get around to attaching that new mount to the tree won't decide
> to take it out. finish_automount() drops the extra reference after it's
> done with attaching mount to the tree - or drops the reference twice in
> case of error. ->d_automount() instances have rather counterintuitive
> boilerplate in them.
>
> There's a much simpler approach: have mark_mounts_for_expiry() skip the
> mounts that are yet to be mounted. And to hell with grabbing/dropping
> those extra references. Makes for simpler correctness analysis, at that...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Looks good!
Reviewed-by: Christian Brauner <brauner@kernel.org>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC][PATCH] saner calling conventions for ->d_automount()
2025-04-24 6:08 [RFC][PATCH] saner calling conventions for ->d_automount() Al Viro
2025-04-24 8:38 ` Christian Brauner
@ 2025-04-24 11:48 ` Jeff Layton
2025-04-24 12:37 ` Paulo Alcantara
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2025-04-24 11:48 UTC (permalink / raw)
To: Al Viro, linux-fsdevel
Cc: Christian Brauner, David Howells, linux-nfs, Steven Rostedt,
Miklos Szeredi, linux-cifs
On Thu, 2025-04-24 at 07:08 +0100, Al Viro wrote:
> Currently the calling conventions for ->d_automount() instances have
> an odd wart - returned new mount to be attached is expected to have
> refcount 2.
>
> That kludge is intended to make sure that mark_mounts_for_expiry() called
> before we get around to attaching that new mount to the tree won't decide
> to take it out. finish_automount() drops the extra reference after it's
> done with attaching mount to the tree - or drops the reference twice in
> case of error. ->d_automount() instances have rather counterintuitive
> boilerplate in them.
>
> There's a much simpler approach: have mark_mounts_for_expiry() skip the
> mounts that are yet to be mounted. And to hell with grabbing/dropping
> those extra references. Makes for simpler correctness analysis, at that...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
> index 767b2927c762..749637231773 100644
> --- a/Documentation/filesystems/porting.rst
> +++ b/Documentation/filesystems/porting.rst
> @@ -1203,3 +1203,10 @@ should use d_drop();d_splice_alias() and return the result of the latter.
> If a positive dentry cannot be returned for some reason, in-kernel
> clients such as cachefiles, nfsd, smb/server may not perform ideally but
> will fail-safe.
> +
> +---
> +
> +**mandatory**
> +
> +Calling conventions for ->d_automount() have changed; we should *not* grab
> +an extra reference to new mount - it should be returned with refcount 1.
> diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
> index ae79c30b6c0c..cc0a58e96770 100644
> --- a/Documentation/filesystems/vfs.rst
> +++ b/Documentation/filesystems/vfs.rst
> @@ -1411,9 +1411,7 @@ defined:
>
> If a vfsmount is returned, the caller will attempt to mount it
> on the mountpoint and will remove the vfsmount from its
> - expiration list in the case of failure. The vfsmount should be
> - returned with 2 refs on it to prevent automatic expiration - the
> - caller will clean up the additional ref.
> + expiration list in the case of failure.
>
> This function is only used if DCACHE_NEED_AUTOMOUNT is set on
> the dentry. This is set by __d_instantiate() if S_AUTOMOUNT is
> diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
> index 45cee6534122..9434a5399f2b 100644
> --- a/fs/afs/mntpt.c
> +++ b/fs/afs/mntpt.c
> @@ -189,7 +189,6 @@ struct vfsmount *afs_d_automount(struct path *path)
> if (IS_ERR(newmnt))
> return newmnt;
>
> - mntget(newmnt); /* prevent immediate expiration */
> mnt_set_expiry(newmnt, &afs_vfsmounts);
> queue_delayed_work(afs_wq, &afs_mntpt_expiry_timer,
> afs_mntpt_expiry_timeout * HZ);
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 83ac192e7fdd..05d8584fd3b9 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -319,9 +319,6 @@ static struct vfsmount *fuse_dentry_automount(struct path *path)
>
> /* Create the submount */
> mnt = fc_mount(fsc);
> - if (!IS_ERR(mnt))
> - mntget(mnt);
> -
> put_fs_context(fsc);
> return mnt;
> }
> diff --git a/fs/namespace.c b/fs/namespace.c
> index bbda516444ff..1807ccb1a52d 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -3903,10 +3903,6 @@ int finish_automount(struct vfsmount *m, const struct path *path)
> return PTR_ERR(m);
>
> mnt = real_mount(m);
> - /* The new mount record should have at least 2 refs to prevent it being
> - * expired before we get a chance to add it
> - */
> - BUG_ON(mnt_get_count(mnt) < 2);
>
> if (m->mnt_sb == path->mnt->mnt_sb &&
> m->mnt_root == dentry) {
> @@ -3939,7 +3935,6 @@ int finish_automount(struct vfsmount *m, const struct path *path)
> unlock_mount(mp);
> if (unlikely(err))
> goto discard;
> - mntput(m);
> return 0;
>
> discard_locked:
> @@ -3953,7 +3948,6 @@ int finish_automount(struct vfsmount *m, const struct path *path)
> namespace_unlock();
> }
> mntput(m);
> - mntput(m);
> return err;
> }
>
> @@ -3990,11 +3984,14 @@ void mark_mounts_for_expiry(struct list_head *mounts)
>
> /* extract from the expiration list every vfsmount that matches the
> * following criteria:
> + * - already mounted
> * - only referenced by its parent vfsmount
> * - still marked for expiry (marked on the last call here; marks are
> * cleared by mntput())
> */
> list_for_each_entry_safe(mnt, next, mounts, mnt_expire) {
> + if (!is_mounted(&mnt->mnt))
> + continue;
> if (!xchg(&mnt->mnt_expiry_mark, 1) ||
> propagate_mount_busy(mnt, 1))
> continue;
> diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
> index 973aed9cc5fe..7f1ec9c67ff2 100644
> --- a/fs/nfs/namespace.c
> +++ b/fs/nfs/namespace.c
> @@ -195,7 +195,6 @@ struct vfsmount *nfs_d_automount(struct path *path)
> if (IS_ERR(mnt))
> goto out_fc;
>
> - mntget(mnt); /* prevent immediate expiration */
> if (timeout <= 0)
> goto out_fc;
>
> diff --git a/fs/smb/client/namespace.c b/fs/smb/client/namespace.c
> index e3f9213131c4..778daf11f1db 100644
> --- a/fs/smb/client/namespace.c
> +++ b/fs/smb/client/namespace.c
> @@ -283,7 +283,6 @@ struct vfsmount *cifs_d_automount(struct path *path)
> return newmnt;
> }
>
> - mntget(newmnt); /* prevent immediate expiration */
> mnt_set_expiry(newmnt, &cifs_automount_list);
> schedule_delayed_work(&cifs_automount_task,
> cifs_mountpoint_expiry_timeout);
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 8ddf6b17215c..fa488721019f 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -10085,8 +10085,6 @@ static struct vfsmount *trace_automount(struct dentry *mntpt, void *ingore)
> put_filesystem(type);
> if (IS_ERR(mnt))
> return NULL;
> - mntget(mnt);
> -
> return mnt;
> }
>
>
Much cleaner:
Reviewed-by: Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC][PATCH] saner calling conventions for ->d_automount()
2025-04-24 6:08 [RFC][PATCH] saner calling conventions for ->d_automount() Al Viro
2025-04-24 8:38 ` Christian Brauner
2025-04-24 11:48 ` Jeff Layton
@ 2025-04-24 12:37 ` Paulo Alcantara
2025-04-25 9:43 ` David Howells
2025-04-25 17:16 ` Steven Rostedt
4 siblings, 0 replies; 6+ messages in thread
From: Paulo Alcantara @ 2025-04-24 12:37 UTC (permalink / raw)
To: Al Viro, linux-fsdevel
Cc: Christian Brauner, David Howells, linux-nfs, Steven Rostedt,
Miklos Szeredi, linux-cifs
Al Viro <viro@zeniv.linux.org.uk> writes:
> Currently the calling conventions for ->d_automount() instances have
> an odd wart - returned new mount to be attached is expected to have
> refcount 2.
>
> That kludge is intended to make sure that mark_mounts_for_expiry() called
> before we get around to attaching that new mount to the tree won't decide
> to take it out. finish_automount() drops the extra reference after it's
> done with attaching mount to the tree - or drops the reference twice in
> case of error. ->d_automount() instances have rather counterintuitive
> boilerplate in them.
>
> There's a much simpler approach: have mark_mounts_for_expiry() skip the
> mounts that are yet to be mounted. And to hell with grabbing/dropping
> those extra references. Makes for simpler correctness analysis, at that...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC][PATCH] saner calling conventions for ->d_automount()
2025-04-24 6:08 [RFC][PATCH] saner calling conventions for ->d_automount() Al Viro
` (2 preceding siblings ...)
2025-04-24 12:37 ` Paulo Alcantara
@ 2025-04-25 9:43 ` David Howells
2025-04-25 17:16 ` Steven Rostedt
4 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2025-04-25 9:43 UTC (permalink / raw)
To: Al Viro
Cc: dhowells, linux-fsdevel, Christian Brauner, linux-nfs,
Steven Rostedt, Miklos Szeredi, linux-cifs
Al Viro <viro@zeniv.linux.org.uk> wrote:
> Currently the calling conventions for ->d_automount() instances have
> an odd wart - returned new mount to be attached is expected to have
> refcount 2.
>
> That kludge is intended to make sure that mark_mounts_for_expiry() called
> before we get around to attaching that new mount to the tree won't decide
> to take it out. finish_automount() drops the extra reference after it's
> done with attaching mount to the tree - or drops the reference twice in
> case of error. ->d_automount() instances have rather counterintuitive
> boilerplate in them.
>
> There's a much simpler approach: have mark_mounts_for_expiry() skip the
> mounts that are yet to be mounted. And to hell with grabbing/dropping
> those extra references. Makes for simpler correctness analysis, at that...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
You can test this by installing the kafs-client package, telling systemd to
start afs.mount and then doing "ls /afs/openafs.org/doc", say.
Acked-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC][PATCH] saner calling conventions for ->d_automount()
2025-04-24 6:08 [RFC][PATCH] saner calling conventions for ->d_automount() Al Viro
` (3 preceding siblings ...)
2025-04-25 9:43 ` David Howells
@ 2025-04-25 17:16 ` Steven Rostedt
4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2025-04-25 17:16 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, Christian Brauner, David Howells, linux-nfs,
Miklos Szeredi, linux-cifs
On Thu, 24 Apr 2025 07:08:45 +0100
Al Viro <viro@zeniv.linux.org.uk> wrote:
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 8ddf6b17215c..fa488721019f 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -10085,8 +10085,6 @@ static struct vfsmount *trace_automount(struct dentry *mntpt, void *ingore)
> put_filesystem(type);
> if (IS_ERR(mnt))
> return NULL;
> - mntget(mnt);
> -
> return mnt;
> }
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-- Steve
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-04-25 17:14 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-24 6:08 [RFC][PATCH] saner calling conventions for ->d_automount() Al Viro
2025-04-24 8:38 ` Christian Brauner
2025-04-24 11:48 ` Jeff Layton
2025-04-24 12:37 ` Paulo Alcantara
2025-04-25 9:43 ` David Howells
2025-04-25 17:16 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox