public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/1] statmount: mountinfo for "unmounted" mounts
@ 2025-10-11 12:46 Bhavik Sachdev
  2025-10-11 12:46 ` [PATCH v2 1/1] statmount: accept fd as a parameter Bhavik Sachdev
  0 siblings, 1 reply; 6+ messages in thread
From: Bhavik Sachdev @ 2025-10-11 12:46 UTC
  To: Alexander Viro, Christian Brauner
  Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Bhavik Sachdev,
	Pavel Tikhomirov, Jan Kara, John Garry, Arnaldo Carvalho de Melo,
	Darrick J . Wong, Namhyung Kim, Ingo Molnar, Andrei Vagin,
	Alexander Mikhalitsyn

By "unmounted" mounts we mean mounts that have been unmounted using
umount2(mnt, MNT_DETACH) but we still have file descriptors to files on
that mount. We want to add the ability to handle such mounts in CRIU
(Checkpoint/Restore in Userspace).

Currently, we have no way to get mount info for these mounts as they do
not appear in /proc/<pid>/mountinfo and statmount does not work on them.

We solve this problem by introducing a file descriptor parameter to
statmount, along with a STATMOUNT_FD flag. Even if this file descriptor
is on an "unmounted" mount, we are still able to get mountinfo for the
mount. We report the mountpoint of the mount as "[detached]" and
mnt_ns_id as 0.
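
As a rough illustration of the interface described above, a userspace
caller might build the request as sketched below. This assumes the request
layout and flag value proposed in this series. The struct name
`mnt_id_req_fd`, the macro `STATMOUNT_FD_PROPOSED`, and both helper
functions are hypothetical names used only for this sketch, and the
syscall number fallback is the x86-64/asm-generic value.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: mirrors the 40-byte request layout proposed in this series. */
struct mnt_id_req_fd {
	uint32_t size;
	uint32_t spare;
	uint64_t mnt_id;
	uint64_t param;     /* statmount(2): mask of requested fields */
	uint64_t mnt_ns_id;
	int32_t  fd;        /* proposed: fd open on the mount of interest */
	uint32_t spare2;    /* proposed: padding */
};

/* Assumption: flag value as defined by this series. */
#define STATMOUNT_FD_PROPOSED 0x1U

#ifndef SYS_statmount
#define SYS_statmount 457 /* x86-64 and asm-generic number */
#endif

/* Fill a request that identifies the mount by fd instead of by mount ID. */
static void init_req_by_fd(struct mnt_id_req_fd *req, int fd, uint64_t mask)
{
	memset(req, 0, sizeof(*req));
	req->size = sizeof(*req);
	req->param = mask;
	req->fd = fd;
}

/* Hypothetical wrapper: returns 0 on success, -1 with errno set otherwise. */
long statmount_by_fd(int fd, uint64_t mask, void *buf, size_t bufsize)
{
	struct mnt_id_req_fd req;

	init_req_by_fd(&req, fd, mask);
	return syscall(SYS_statmount, &req, buf, bufsize, STATMOUNT_FD_PROPOSED);
}
```

On a kernel without this series the call is expected to fail, so a caller
would still need a fallback path.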

v1 of this patchset took a different approach and introduced a new
umount_mnt_ns, into which "unmounted" mounts would be moved (instead of
their namespace being NULL), thus keeping them available via
statmount:
https://lore.kernel.org/linux-fsdevel/20251002125422.203598-1-b.sachdev1904@gmail.com/

That approach complicated namespace locking and modified performance
sensitive code.
See: https://lore.kernel.org/linux-fsdevel/7e4d9eb5-6dde-4c59-8ee3-358233f082d0@virtuozzo.com/

Christian also talked about a separate approach of tying the _lifetime_
of the mount namespace to the lifetime of the unmounted mounts through
the passive reference count by moving them to a separate rb_root
`unmounted` in the namespace instance.

This approach has a few problems, including:
1. It further extends the scope of the namespace semaphore.
2. It is weird to be able to statmount() via mount id if the mount
namespace is dead.

For a more complete description, see:
https://lore.kernel.org/linux-fsdevel/20251006-erlesen-anlagen-9af59899a969@brauner/

Aleksa Sarai also pointed out that this fd-based approach is similar to
fstatfs(2), which returns information about a mounted filesystem when
given an fd open on that filesystem.
https://lore.kernel.org/linux-fsdevel/2025-10-07-lavish-refried-navy-journey-EqHk9K@cyphar.com/
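
For comparison, the existing fd-based pattern looks roughly like the
sketch below: fstatfs(2) needs only an open fd, not a path or a mount ID.
The helper name `fs_magic_of_fd` is hypothetical.

```c
#include <sys/vfs.h>

/*
 * Report the filesystem magic number of the filesystem that fd is open on.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int fs_magic_of_fd(int fd, long *magic)
{
	struct statfs sfs;

	if (fstatfs(fd, &sfs) != 0)
		return -1;
	*magic = (long)sfs.f_type;
	return 0;
}
```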

We use this patchset with CRIU to support checkpoint/restore of
"unmounted" mounts in this pull request:
https://github.com/checkpoint-restore/criu/pull/2754.

All these patches are also available in this branch on GitHub:
https://github.com/bsach64/linux/tree/statmount-fd-v2

Bhavik Sachdev (1):
  statmount: accept fd as a parameter

 fs/namespace.c             | 80 ++++++++++++++++++++++++++++----------
 include/uapi/linux/mount.h |  8 ++++
 2 files changed, 67 insertions(+), 21 deletions(-)

-- 
2.51.0



* [PATCH v2 1/1] statmount: accept fd as a parameter
  2025-10-11 12:46 [PATCH v2 0/1] statmount: mountinfo for "unmounted" mounts Bhavik Sachdev
@ 2025-10-11 12:46 ` Bhavik Sachdev
  2025-10-21 12:11   ` Christian Brauner
  2025-10-22 16:32   ` Miklos Szeredi
  0 siblings, 2 replies; 6+ messages in thread
From: Bhavik Sachdev @ 2025-10-11 12:46 UTC
  To: Alexander Viro, Christian Brauner
  Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Bhavik Sachdev,
	Pavel Tikhomirov, Jan Kara, John Garry, Arnaldo Carvalho de Melo,
	Darrick J . Wong, Namhyung Kim, Ingo Molnar, Andrei Vagin,
	Alexander Mikhalitsyn

Extend `struct mnt_id_req` to take an fd and introduce the STATMOUNT_FD
flag. When a valid fd is provided and STATMOUNT_FD is set, statmount
will return mountinfo about the mount the fd is on.

This even works for "unmounted" mounts (mounts that have been unmounted
using umount2(mnt, MNT_DETACH)), if you have access to a file descriptor
on that mount. These "unmounted" mounts have no mountpoint, hence we
report the mountpoint as "[detached]" and the mnt_ns_id as 0.

Co-developed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
---
 fs/namespace.c             | 80 ++++++++++++++++++++++++++++----------
 include/uapi/linux/mount.h |  8 ++++
 2 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index d82910f33dc4..eb82a22cffd5 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -5207,6 +5207,12 @@ static int statmount_mnt_root(struct kstatmount *s, struct seq_file *seq)
 	return 0;
 }
 
+static int statmount_mnt_point_detached(struct kstatmount *s, struct seq_file *seq)
+{
+	seq_puts(seq, "[detached]");
+	return 0;
+}
+
 static int statmount_mnt_point(struct kstatmount *s, struct seq_file *seq)
 {
 	struct vfsmount *mnt = s->mnt;
@@ -5262,7 +5268,10 @@ static int statmount_sb_source(struct kstatmount *s, struct seq_file *seq)
 static void statmount_mnt_ns_id(struct kstatmount *s, struct mnt_namespace *ns)
 {
 	s->sm.mask |= STATMOUNT_MNT_NS_ID;
-	s->sm.mnt_ns_id = ns->ns.ns_id;
+	if (ns)
+		s->sm.mnt_ns_id = ns->ns.ns_id;
+	else
+		s->sm.mnt_ns_id = 0;
 }
 
 static int statmount_mnt_opts(struct kstatmount *s, struct seq_file *seq)
@@ -5431,7 +5440,10 @@ static int statmount_string(struct kstatmount *s, u64 flag)
 		break;
 	case STATMOUNT_MNT_POINT:
 		offp = &sm->mnt_point;
-		ret = statmount_mnt_point(s, seq);
+		if (!s->root.mnt && !s->root.dentry)
+			ret = statmount_mnt_point_detached(s, seq);
+		else
+			ret = statmount_mnt_point(s, seq);
 		break;
 	case STATMOUNT_MNT_OPTS:
 		offp = &sm->mnt_opts;
@@ -5572,29 +5584,33 @@ static int grab_requested_root(struct mnt_namespace *ns, struct path *root)
 
 /* locks: namespace_shared */
 static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
-			struct mnt_namespace *ns)
+			struct mnt_namespace *ns, unsigned int flags)
 {
 	struct mount *m;
 	int err;
 
 	/* Has the namespace already been emptied? */
-	if (mnt_ns_id && mnt_ns_empty(ns))
+	if (!(flags & STATMOUNT_FD) && mnt_ns_id && mnt_ns_empty(ns))
 		return -ENOENT;
 
-	s->mnt = lookup_mnt_in_ns(mnt_id, ns);
-	if (!s->mnt)
-		return -ENOENT;
+	if (!(flags & STATMOUNT_FD)) {
+		s->mnt = lookup_mnt_in_ns(mnt_id, ns);
+		if (!s->mnt)
+			return -ENOENT;
+	}
 
-	err = grab_requested_root(ns, &s->root);
-	if (err)
-		return err;
+	if (ns) {
+		err = grab_requested_root(ns, &s->root);
+		if (err)
+			return err;
+	}
 
 	/*
 	 * Don't trigger audit denials. We just want to determine what
 	 * mounts to show users.
 	 */
 	m = real_mount(s->mnt);
-	if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
+	if (ns && !is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
 	    !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
 		return -EPERM;
 
@@ -5718,12 +5734,12 @@ static int prepare_kstatmount(struct kstatmount *ks, struct mnt_id_req *kreq,
 }
 
 static int copy_mnt_id_req(const struct mnt_id_req __user *req,
-			   struct mnt_id_req *kreq)
+			   struct mnt_id_req *kreq, unsigned int flags)
 {
 	int ret;
 	size_t usize;
 
-	BUILD_BUG_ON(sizeof(struct mnt_id_req) != MNT_ID_REQ_SIZE_VER1);
+	BUILD_BUG_ON(sizeof(struct mnt_id_req) != MNT_ID_REQ_SIZE_VER2);
 
 	ret = get_user(usize, &req->size);
 	if (ret)
@@ -5738,6 +5754,11 @@ static int copy_mnt_id_req(const struct mnt_id_req __user *req,
 		return ret;
 	if (kreq->spare != 0)
 		return -EINVAL;
+	if (flags & STATMOUNT_FD) {
+		if (kreq->fd < 0)
+			return -EINVAL;
+		return 0;
+	}
 	/* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */
 	if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET)
 		return -EINVAL;
@@ -5788,23 +5809,37 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
 {
 	struct mnt_namespace *ns __free(mnt_ns_release) = NULL;
 	struct kstatmount *ks __free(kfree) = NULL;
+	struct vfsmount *fd_mnt;
 	struct mnt_id_req kreq;
 	/* We currently support retrieval of 3 strings. */
 	size_t seq_size = 3 * PATH_MAX;
 	int ret;
 
-	if (flags)
+	if (flags & ~STATMOUNT_FD)
 		return -EINVAL;
 
-	ret = copy_mnt_id_req(req, &kreq);
+	ret = copy_mnt_id_req(req, &kreq, flags);
 	if (ret)
 		return ret;
 
-	ns = grab_requested_mnt_ns(&kreq);
-	if (!ns)
-		return -ENOENT;
+	if (flags & STATMOUNT_FD) {
+		CLASS(fd_raw, f)(kreq.fd);
+		if (fd_empty(f))
+			return -EBADF;
+		fd_mnt = fd_file(f)->f_path.mnt;
+		ns = real_mount(fd_mnt)->mnt_ns;
+		if (ns)
+			refcount_inc(&ns->passive);
+		else
+			if (!ns_capable_noaudit(fd_file(f)->f_cred->user_ns, CAP_SYS_ADMIN))
+				return -ENOENT;
+	} else {
+		ns = grab_requested_mnt_ns(&kreq);
+		if (!ns)
+			return -ENOENT;
+	}
 
-	if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) &&
+	if (ns && (ns != current->nsproxy->mnt_ns) &&
 	    !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
 		return -ENOENT;
 
@@ -5817,8 +5852,11 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
 	if (ret)
 		return ret;
 
+	if (flags & STATMOUNT_FD)
+		ks->mnt = fd_mnt;
+
 	scoped_guard(namespace_shared)
-		ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns);
+		ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns, flags);
 
 	if (!ret)
 		ret = copy_statmount_to_user(ks);
@@ -5957,7 +5995,7 @@ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
 	if (!access_ok(mnt_ids, nr_mnt_ids * sizeof(*mnt_ids)))
 		return -EFAULT;
 
-	ret = copy_mnt_id_req(req, &kreq);
+	ret = copy_mnt_id_req(req, &kreq, 0);
 	if (ret)
 		return ret;
 
diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
index 7fa67c2031a5..dfe8b8e7fa8d 100644
--- a/include/uapi/linux/mount.h
+++ b/include/uapi/linux/mount.h
@@ -201,11 +201,14 @@ struct mnt_id_req {
 	__u64 mnt_id;
 	__u64 param;
 	__u64 mnt_ns_id;
+	__s32 fd;
+	__u32 spare2;
 };
 
 /* List of all mnt_id_req versions. */
 #define MNT_ID_REQ_SIZE_VER0	24 /* sizeof first published struct */
 #define MNT_ID_REQ_SIZE_VER1	32 /* sizeof second published struct */
+#define MNT_ID_REQ_SIZE_VER2	40 /* sizeof third published struct */
 
 /*
  * @mask bits for statmount(2)
@@ -232,4 +235,9 @@ struct mnt_id_req {
 #define LSMT_ROOT		0xffffffffffffffff	/* root mount */
 #define LISTMOUNT_REVERSE	(1 << 0) /* List later mounts first */
 
+/*
+ * @flag bits for statmount(2)
+ */
+#define STATMOUNT_FD		0x0000001U /* want mountinfo for given fd */
+
 #endif /* _UAPI_LINUX_MOUNT_H */
-- 
2.51.0



* Re: [PATCH v2 1/1] statmount: accept fd as a parameter
  2025-10-11 12:46 ` [PATCH v2 1/1] statmount: accept fd as a parameter Bhavik Sachdev
@ 2025-10-21 12:11   ` Christian Brauner
  2025-10-22 15:39     ` Bhavik Sachdev
  2025-10-22 16:32   ` Miklos Szeredi
  1 sibling, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2025-10-21 12:11 UTC
  To: Bhavik Sachdev
  Cc: Alexander Viro, linux-fsdevel, linux-kernel, Aleksa Sarai,
	Pavel Tikhomirov, Jan Kara, John Garry, Arnaldo Carvalho de Melo,
	Darrick J . Wong, Namhyung Kim, Ingo Molnar, Andrei Vagin,
	Alexander Mikhalitsyn

On Sat, Oct 11, 2025 at 06:16:11PM +0530, Bhavik Sachdev wrote:
> Extend `struct mnt_id_req` to take in a fd and introduce STATMOUNT_FD
> flag. When a valid fd is provided and STATMOUNT_FD is set, statmount
> will return mountinfo about the mount the fd is on.
> 
> This even works for "unmounted" mounts (mounts that have been umounted
> using umount2(mnt, MNT_DETACH)), if you have access to a file descriptor
> on that mount. These "umounted" mounts will have no mountpoint hence we
> return "[detached]" and the mnt_ns_id to be 0.
> 
> Co-developed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
> Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
> ---
>  fs/namespace.c             | 80 ++++++++++++++++++++++++++++----------
>  include/uapi/linux/mount.h |  8 ++++
>  2 files changed, 67 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index d82910f33dc4..eb82a22cffd5 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -5207,6 +5207,12 @@ static int statmount_mnt_root(struct kstatmount *s, struct seq_file *seq)
>  	return 0;
>  }
>  
> +static int statmount_mnt_point_detached(struct kstatmount *s, struct seq_file *seq)
> +{
> +	seq_puts(seq, "[detached]");
> +	return 0;
> +}
> +
>  static int statmount_mnt_point(struct kstatmount *s, struct seq_file *seq)
>  {
>  	struct vfsmount *mnt = s->mnt;
> @@ -5262,7 +5268,10 @@ static int statmount_sb_source(struct kstatmount *s, struct seq_file *seq)
>  static void statmount_mnt_ns_id(struct kstatmount *s, struct mnt_namespace *ns)
>  {
>  	s->sm.mask |= STATMOUNT_MNT_NS_ID;
> -	s->sm.mnt_ns_id = ns->ns.ns_id;
> +	if (ns)
> +		s->sm.mnt_ns_id = ns->ns.ns_id;
> +	else
> +		s->sm.mnt_ns_id = 0;
>  }
>  
>  static int statmount_mnt_opts(struct kstatmount *s, struct seq_file *seq)
> @@ -5431,7 +5440,10 @@ static int statmount_string(struct kstatmount *s, u64 flag)
>  		break;
>  	case STATMOUNT_MNT_POINT:
>  		offp = &sm->mnt_point;
> -		ret = statmount_mnt_point(s, seq);
> +		if (!s->root.mnt && !s->root.dentry)
> +			ret = statmount_mnt_point_detached(s, seq);
> +		else
> +			ret = statmount_mnt_point(s, seq);
>  		break;
>  	case STATMOUNT_MNT_OPTS:
>  		offp = &sm->mnt_opts;
> @@ -5572,29 +5584,33 @@ static int grab_requested_root(struct mnt_namespace *ns, struct path *root)
>  
>  /* locks: namespace_shared */
>  static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
> -			struct mnt_namespace *ns)
> +			struct mnt_namespace *ns, unsigned int flags)
>  {
>  	struct mount *m;
>  	int err;
>  
>  	/* Has the namespace already been emptied? */
> -	if (mnt_ns_id && mnt_ns_empty(ns))
> +	if (!(flags & STATMOUNT_FD) && mnt_ns_id && mnt_ns_empty(ns))
>  		return -ENOENT;
>  
> -	s->mnt = lookup_mnt_in_ns(mnt_id, ns);
> -	if (!s->mnt)
> -		return -ENOENT;
> +	if (!(flags & STATMOUNT_FD)) {
> +		s->mnt = lookup_mnt_in_ns(mnt_id, ns);
> +		if (!s->mnt)
> +			return -ENOENT;
> +	}
>  
> -	err = grab_requested_root(ns, &s->root);
> -	if (err)
> -		return err;
> +	if (ns) {
> +		err = grab_requested_root(ns, &s->root);
> +		if (err)
> +			return err;
> +	}
>  
>  	/*
>  	 * Don't trigger audit denials. We just want to determine what
>  	 * mounts to show users.
>  	 */
>  	m = real_mount(s->mnt);
> -	if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
> +	if (ns && !is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
>  	    !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> @@ -5718,12 +5734,12 @@ static int prepare_kstatmount(struct kstatmount *ks, struct mnt_id_req *kreq,
>  }
>  
>  static int copy_mnt_id_req(const struct mnt_id_req __user *req,
> -			   struct mnt_id_req *kreq)
> +			   struct mnt_id_req *kreq, unsigned int flags)
>  {
>  	int ret;
>  	size_t usize;
>  
> -	BUILD_BUG_ON(sizeof(struct mnt_id_req) != MNT_ID_REQ_SIZE_VER1);
> +	BUILD_BUG_ON(sizeof(struct mnt_id_req) != MNT_ID_REQ_SIZE_VER2);
>  
>  	ret = get_user(usize, &req->size);
>  	if (ret)
> @@ -5738,6 +5754,11 @@ static int copy_mnt_id_req(const struct mnt_id_req __user *req,
>  		return ret;
>  	if (kreq->spare != 0)
>  		return -EINVAL;
> +	if (flags & STATMOUNT_FD) {
> +		if (kreq->fd < 0)
> +			return -EINVAL;
> +		return 0;
> +	}
>  	/* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */
>  	if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET)
>  		return -EINVAL;
> @@ -5788,23 +5809,37 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
>  {
>  	struct mnt_namespace *ns __free(mnt_ns_release) = NULL;
>  	struct kstatmount *ks __free(kfree) = NULL;
> +	struct vfsmount *fd_mnt;
>  	struct mnt_id_req kreq;
>  	/* We currently support retrieval of 3 strings. */
>  	size_t seq_size = 3 * PATH_MAX;
>  	int ret;
>  
> -	if (flags)
> +	if (flags & ~STATMOUNT_FD)
>  		return -EINVAL;
>  
> -	ret = copy_mnt_id_req(req, &kreq);
> +	ret = copy_mnt_id_req(req, &kreq, flags);
>  	if (ret)
>  		return ret;
>  
> -	ns = grab_requested_mnt_ns(&kreq);
> -	if (!ns)
> -		return -ENOENT;
> +	if (flags & STATMOUNT_FD) {
> +		CLASS(fd_raw, f)(kreq.fd);
> +		if (fd_empty(f))
> +			return -EBADF;
> +		fd_mnt = fd_file(f)->f_path.mnt;
> +		ns = real_mount(fd_mnt)->mnt_ns;
> +		if (ns)
> +			refcount_inc(&ns->passive);
> +		else
> +			if (!ns_capable_noaudit(fd_file(f)->f_cred->user_ns, CAP_SYS_ADMIN))
> +				return -ENOENT;
> +	} else {
> +		ns = grab_requested_mnt_ns(&kreq);
> +		if (!ns)
> +			return -ENOENT;
> +	}
>  
> -	if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) &&
> +	if (ns && (ns != current->nsproxy->mnt_ns) &&
>  	    !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
>  		return -ENOENT;
>  
> @@ -5817,8 +5852,11 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
>  	if (ret)
>  		return ret;
>  
> +	if (flags & STATMOUNT_FD)
> +		ks->mnt = fd_mnt;

The reference to fd_mnt is bound to the scope of CLASS(fd_raw, f)(kreq.fd) above.
That means you don't hold a reference to fd_mnt here, so this is a UAF waiting to happen.
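
The lifetime problem can be modelled in userspace with the compiler's
cleanup attribute (a GCC/Clang extension), which is the mechanism the
kernel's scope-based guards build on. This is a toy sketch, not kernel
code: `toy_mnt`, `toy_get`, `toy_put` and `lookup_pinned` are hypothetical
stand-ins. It shows one way to keep a pointer valid past the guarded
scope, namely taking a separate reference before that scope ends.

```c
#include <stddef.h>

/* Toy refcounted object standing in for a mount. */
struct toy_mnt {
	int refs;
};

static struct toy_mnt *toy_get(struct toy_mnt *m)
{
	m->refs++;
	return m;
}

static void toy_put(struct toy_mnt *m)
{
	m->refs--;
}

/* Cleanup handler: runs when the annotated variable leaves scope. */
static void toy_put_cleanup(struct toy_mnt **mp)
{
	if (*mp)
		toy_put(*mp);
}

/* Returns a pointer that stays pinned; the caller must toy_put() it. */
static struct toy_mnt *lookup_pinned(struct toy_mnt *backing)
{
	struct toy_mnt *pinned = NULL;

	{
		/* Scope-bound reference, like the one held through the fd. */
		struct toy_mnt *scoped
			__attribute__((cleanup(toy_put_cleanup))) = toy_get(backing);

		/* Take our own reference before the scope ends. */
		pinned = toy_get(scoped);
	} /* the scoped reference is dropped here */

	return pinned;
}
```

Without the second toy_get(), the pointer returned from the inner scope
would no longer be backed by any reference held by the caller.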

> +
>  	scoped_guard(namespace_shared)
> -		ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns);
> +		ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns, flags);
>  
>  	if (!ret)
>  		ret = copy_statmount_to_user(ks);
> @@ -5957,7 +5995,7 @@ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
>  	if (!access_ok(mnt_ids, nr_mnt_ids * sizeof(*mnt_ids)))
>  		return -EFAULT;
>  
> -	ret = copy_mnt_id_req(req, &kreq);
> +	ret = copy_mnt_id_req(req, &kreq, 0);
>  	if (ret)
>  		return ret;
>  
> diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
> index 7fa67c2031a5..dfe8b8e7fa8d 100644
> --- a/include/uapi/linux/mount.h
> +++ b/include/uapi/linux/mount.h
> @@ -201,11 +201,14 @@ struct mnt_id_req {
>  	__u64 mnt_id;
>  	__u64 param;
>  	__u64 mnt_ns_id;
> +	__s32 fd;
> +	__u32 spare2;
>  };

Hm, do you really need a new field? You could just use the @spare
parameter in struct mnt_id_req. copy_mnt_id_req(), which is used by both
statmount() and listmount(), currently rejects any non-zero value for
it.

I think you could just reuse it for this purpose in statmount(). And
then maybe the flag should be STATMOUNT_BY_FD?

Otherwise I think this could work.
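
The size arithmetic behind this suggestion can be checked with a small
sketch. The three struct names below are hypothetical copies made for
illustration: `req_ver1` and `req_ver2` follow the layout shown in the
patch, while the `req_ver0` layout is an assumption inferred from its
24-byte size. Reusing the existing 32-bit @spare slot would keep the
request at 32 bytes, whereas the new field grows it to 40.

```c
#include <stddef.h>
#include <stdint.h>

struct req_ver0 {          /* assumed layout of the first published struct */
	uint32_t size;
	uint32_t spare;
	uint64_t mnt_id;
	uint64_t param;
};

struct req_ver1 {          /* current struct, as shown in the patch context */
	uint32_t size;
	uint32_t spare;
	uint64_t mnt_id;
	uint64_t param;
	uint64_t mnt_ns_id;
};

struct req_ver2 {          /* struct as extended by this patch */
	uint32_t size;
	uint32_t spare;
	uint64_t mnt_id;
	uint64_t param;
	uint64_t mnt_ns_id;
	int32_t  fd;
	uint32_t spare2;
};

_Static_assert(sizeof(struct req_ver0) == 24, "MNT_ID_REQ_SIZE_VER0");
_Static_assert(sizeof(struct req_ver1) == 32, "MNT_ID_REQ_SIZE_VER1");
_Static_assert(sizeof(struct req_ver2) == 40, "MNT_ID_REQ_SIZE_VER2");

/* @spare is a 32-bit slot right after @size, wide enough to carry an fd. */
_Static_assert(offsetof(struct req_ver1, spare) == 4, "spare follows size");
_Static_assert(sizeof(((struct req_ver1 *)0)->spare) == sizeof(int32_t),
	       "spare is as wide as an fd");
```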

>  
>  /* List of all mnt_id_req versions. */
>  #define MNT_ID_REQ_SIZE_VER0	24 /* sizeof first published struct */
>  #define MNT_ID_REQ_SIZE_VER1	32 /* sizeof second published struct */
> +#define MNT_ID_REQ_SIZE_VER2	40 /* sizeof third published struct */
>  
>  /*
>   * @mask bits for statmount(2)
> @@ -232,4 +235,9 @@ struct mnt_id_req {
>  #define LSMT_ROOT		0xffffffffffffffff	/* root mount */
>  #define LISTMOUNT_REVERSE	(1 << 0) /* List later mounts first */
>  
> +/*
> + * @flag bits for statmount(2)
> + */
> +#define STATMOUNT_FD		0x0000001U /* want mountinfo for given fd */
> +
>  #endif /* _UAPI_LINUX_MOUNT_H */
> -- 
> 2.51.0
> 


* Re: [PATCH v2 1/1] statmount: accept fd as a parameter
  2025-10-21 12:11   ` Christian Brauner
@ 2025-10-22 15:39     ` Bhavik Sachdev
  0 siblings, 0 replies; 6+ messages in thread
From: Bhavik Sachdev @ 2025-10-22 15:39 UTC
  To: Christian Brauner
  Cc: Alexander Viro, linux-fsdevel, linux-kernel, Aleksa Sarai,
	Pavel Tikhomirov, Jan Kara, John Garry, Arnaldo Carvalho de Melo,
	Darrick J . Wong, Namhyung Kim, Ingo Molnar, Andrei Vagin,
	Alexander Mikhalitsyn

On Tue Oct 21, 2025 at 5:41 PM IST, Christian Brauner wrote:
> Hm, do you really need a new field? You could just use the @spare
> parameter in struct mnt_id_req. It's currently validated of not being
> allowed to be non-zero in copy_mnt_id_req() which is used by both
> statmount() and listmount().
>
> I think you could just reuse it for this purpose in statmount(). And
> then maybe the flag should be STATMOUNT_BY_FD?
>
We made a new field because we thought @spare was already being used (or
would have a future use?). grab_requested_mnt_ns uses @spare as a mount
namespace fd [1], but we also only allow @spare to be 0, so I don't
really understand what's happening here. Is this functionality disabled?

> Otherwise I think this could work.
>
Thanks, Christian! I will send a new patch with all your requested
changes.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7b9d14af8777ac439bbfa9ac73a12a6d85289e7e


* Re: [PATCH v2 1/1] statmount: accept fd as a parameter
  2025-10-11 12:46 ` [PATCH v2 1/1] statmount: accept fd as a parameter Bhavik Sachdev
  2025-10-21 12:11   ` Christian Brauner
@ 2025-10-22 16:32   ` Miklos Szeredi
  2025-10-22 18:12     ` Bhavik Sachdev
  1 sibling, 1 reply; 6+ messages in thread
From: Miklos Szeredi @ 2025-10-22 16:32 UTC
  To: Bhavik Sachdev
  Cc: Alexander Viro, Christian Brauner, linux-fsdevel, linux-kernel,
	Aleksa Sarai, Pavel Tikhomirov, Jan Kara, John Garry,
	Arnaldo Carvalho de Melo, Darrick J . Wong, Namhyung Kim,
	Ingo Molnar, Andrei Vagin, Alexander Mikhalitsyn

On Sat, 11 Oct 2025 at 14:48, Bhavik Sachdev <b.sachdev1904@gmail.com> wrote:
>
> Extend `struct mnt_id_req` to take in a fd and introduce STATMOUNT_FD
> flag. When a valid fd is provided and STATMOUNT_FD is set, statmount
> will return mountinfo about the mount the fd is on.

What's wrong with statx + statmount?

Thanks,
Miklos


* Re: [PATCH v2 1/1] statmount: accept fd as a parameter
  2025-10-22 16:32   ` Miklos Szeredi
@ 2025-10-22 18:12     ` Bhavik Sachdev
  0 siblings, 0 replies; 6+ messages in thread
From: Bhavik Sachdev @ 2025-10-22 18:12 UTC
  To: Miklos Szeredi
  Cc: Alexander Viro, Christian Brauner, linux-fsdevel, linux-kernel,
	Aleksa Sarai, Pavel Tikhomirov, Jan Kara, John Garry,
	Arnaldo Carvalho de Melo, Darrick J . Wong, Namhyung Kim,
	Ingo Molnar, Andrei Vagin, Alexander Mikhalitsyn

On Wed Oct 22, 2025 at 10:02 PM IST, Miklos Szeredi wrote:
> What's wrong with statx + statmount?

We would like to get mountinfo for "unmounted" mounts, i.e., we have an fd
on a mount that has been unmounted with MNT_DETACH. statmount() does not
work on such mounts (with the mnt_id_unique from statx), since they have
no mount namespace. These mounts also don't show up in
/proc/<pid>/mountinfo.
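
For reference, the statx half of the "statx + statmount" route can be
sketched as below. The helper name `unique_mnt_id_of_fd` is hypothetical,
and the fallback value for STATX_MNT_ID_UNIQUE is only there for older
headers. The ID it returns is what would then be passed to statmount(),
which is the step that fails for a detached mount.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>

#ifndef STATX_MNT_ID_UNIQUE
#define STATX_MNT_ID_UNIQUE 0x00004000U /* fallback for older headers */
#endif

/*
 * Look up the unique mount ID of the mount that fd is open on.
 * Returns 0 on success, -1 on failure or if the kernel did not report it.
 */
static int unique_mnt_id_of_fd(int fd, uint64_t *mnt_id)
{
	struct statx stx;

	memset(&stx, 0, sizeof(stx));
	if (statx(fd, "", AT_EMPTY_PATH, STATX_MNT_ID_UNIQUE, &stx) != 0)
		return -1;
	if (!(stx.stx_mask & STATX_MNT_ID_UNIQUE))
		return -1;
	*mnt_id = stx.stx_mnt_id;
	return 0;
}
```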

v1 of this patch tried a different approach by introducing a new mount
namespace for "unmounted" mounts, which had a bunch of complications
[1]. The cover letter for this patch also has more information [2].

We want to support checkpoint/restore of such fds with CRIU [3].

[1]: https://lore.kernel.org/all/20251006-erlesen-anlagen-9af59899a969@brauner/
[2]: https://lore.kernel.org/all/20251011124753.1820802-1-b.sachdev1904@gmail.com/
[3]: https://github.com/checkpoint-restore/criu/pull/2754

Kind regards,
Bhavik

