* [PATCH v4] statmount: accept fd as a parameter
From: Bhavik Sachdev @ 2025-10-29 5:17 UTC
To: Alexander Viro, Christian Brauner
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Bhavik Sachdev,
Pavel Tikhomirov, Jan Kara, John Garry, Arnaldo Carvalho de Melo,
Darrick J . Wong, Namhyung Kim, Ingo Molnar, Andrei Vagin,
Alexander Mikhalitsyn
Extend `struct mnt_id_req` to take an fd and introduce the STATMOUNT_BY_FD
flag. When a valid fd is provided and STATMOUNT_BY_FD is set, statmount
returns mountinfo for the mount the fd is on.
This even works for "unmounted" mounts (mounts that have been detached
using umount2(mnt, MNT_DETACH)), provided you hold a file descriptor on
that mount. These "unmounted" mounts have no mountpoint, so statmount
returns "[unmounted]" as the mount point and 0 as the mnt_ns_id.
Co-developed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
---
We would like to add support for checkpointing and restoring file
descriptors open on these "unmounted" mounts to CRIU (Checkpoint/Restore
in Userspace) [1].
Currently, we have no way to get mount info for these "unmounted" mounts:
they do not appear in /proc/<pid>/mountinfo, and statmount does not
work on them because they do not belong to any mount namespace.
This patch helps us by providing a way to get mountinfo for these
"unmounted" mounts by using an fd on the mount.
Changes from v3 [2] to v4:
* Change the string returned when there is no mountpoint to be
"[unmounted]" instead of "[detached]".
* Remove the new DEFINE_FREE put_file and use the one already present in
include/linux/file.h (fput) [3].
* Inside listmount consistently pass 0 in flags to copy_mnt_id_req and
prepare_klistmount()->grab_requested_mnt_ns() and remove flags from the
prepare_klistmount prototype.
* If STATMOUNT_BY_FD is set, check for mnt_ns_id && mnt_id to be 0.
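To illustrate how a consumer such as CRIU might recognise the "[unmounted]" marker mentioned in the list above, here is a minimal userspace sketch. The helper name and the SKETCH_ prefixed constants are hypothetical; the two mask values are assumed to mirror STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID from include/uapi/linux/mount.h. The caller is assumed to have already extracted the mask, mnt_ns_id, and mount point string from the statmount reply.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the uapi mask bits in include/uapi/linux/mount.h. */
#define SKETCH_STATMOUNT_MNT_POINT	0x00000010U
#define SKETCH_STATMOUNT_MNT_NS_ID	0x00000040U

/*
 * Hypothetical helper: decide whether a statmount reply describes an
 * "unmounted" mount, based on the values this patch reports for it.
 */
bool sketch_reply_is_unmounted(uint64_t mask, uint64_t mnt_ns_id,
			       const char *mnt_point)
{
	/* If the namespace id was returned, it must be 0. */
	if ((mask & SKETCH_STATMOUNT_MNT_NS_ID) && mnt_ns_id != 0)
		return false;

	/* Without a mount point string there is nothing to compare. */
	if (!(mask & SKETCH_STATMOUNT_MNT_POINT) || !mnt_point)
		return false;

	return strcmp(mnt_point, "[unmounted]") == 0;
}
```

A real mount point is an absolute path and so starts with "/", which is why a plain string comparison against the marker should not produce false positives.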
Changes from v2 [4] to v3:
* Rename STATMOUNT_FD flag to STATMOUNT_BY_FD.
* Fixed UAF bug caused by the reference to fd_mount being bound by scope
of CLASS(fd_raw, f)(kreq.fd) by using fget_raw instead.
* Reused @spare parameter in mnt_id_req instead of adding new fields to
the struct.
Changes from v1 [5] to v2:
v1 of this patchset took a different approach and introduced a new
umount_mnt_ns, to which "unmounted" mounts would be moved (instead of
their namespace being NULL), thus keeping them available via
statmount.
Introducing umount_mnt_ns complicated namespace locking and modified
performance-sensitive code [6], and it was agreed that fd-based
statmount would be better.
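For reference, a userspace caller of the fd-based interface might look roughly like the sketch below. It is only an illustration under stated assumptions: the struct is a local mirror of `struct mnt_id_req` as laid out by this patch (installed headers may still call the second field `spare`), the flag value is copied from this patch, and 457 is assumed as the statmount syscall number when the libc headers do not define it. All sketch_ and SKETCH_ names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 457 is the statmount syscall number on most architectures. */
#ifndef __NR_statmount
#define __NR_statmount 457
#endif

/* Local mirror of struct mnt_id_req as laid out after this patch. */
struct sketch_mnt_id_req {
	uint32_t size;
	uint32_t fd;		/* named "spare" before this patch */
	uint64_t mnt_id;
	uint64_t param;		/* request mask for statmount */
	uint64_t mnt_ns_id;
};

/* The kernel accepts this layout as a 32-byte request. */
static_assert(sizeof(struct sketch_mnt_id_req) == 32, "unexpected layout");

#define SKETCH_STATMOUNT_BY_FD	0x00000001U	/* value taken from this patch */

/* Fill a request so that it passes the STATMOUNT_BY_FD checks. */
void sketch_fill_req_by_fd(struct sketch_mnt_id_req *req, int fd, uint64_t mask)
{
	memset(req, 0, sizeof(*req));	/* mnt_id and mnt_ns_id must stay 0 */
	req->size = sizeof(*req);
	req->fd = (uint32_t)fd;
	req->param = mask;
}

/* Thin wrapper: returns 0 on success, -1 with errno set on failure. */
long sketch_statmount_by_fd(int fd, uint64_t mask, void *buf, size_t bufsize)
{
	struct sketch_mnt_id_req req;

	sketch_fill_req_by_fd(&req, fd, mask);
	return syscall(__NR_statmount, &req, buf, bufsize,
		       SKETCH_STATMOUNT_BY_FD);
}
```

The intended use is to open a file on the mount, detach the mount with umount2(mnt, MNT_DETACH), and then pass the still-open fd to the wrapper. On a kernel without this patch the call is expected to fail, since any non-zero flags value is rejected there.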
[1]: https://github.com/checkpoint-restore/criu/pull/2754
[2]: https://lore.kernel.org/all/20251024181443.786363-1-b.sachdev1904@gmail.com/
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/file.h#n97
[4]: https://lore.kernel.org/linux-fsdevel/20251011124753.1820802-1-b.sachdev1904@gmail.com/
[5]: https://lore.kernel.org/linux-fsdevel/20251002125422.203598-1-b.sachdev1904@gmail.com/
[6]: https://lore.kernel.org/linux-fsdevel/7e4d9eb5-6dde-4c59-8ee3-358233f082d0@virtuozzo.com/
---
fs/namespace.c | 101 ++++++++++++++++++++++++++-----------
include/uapi/linux/mount.h | 7 ++-
2 files changed, 77 insertions(+), 31 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index d82910f33dc4..516de51ab521 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -5207,6 +5207,12 @@ static int statmount_mnt_root(struct kstatmount *s, struct seq_file *seq)
return 0;
}
+static int statmount_mnt_point_unmounted(struct kstatmount *s, struct seq_file *seq)
+{
+ seq_puts(seq, "[unmounted]");
+ return 0;
+}
+
static int statmount_mnt_point(struct kstatmount *s, struct seq_file *seq)
{
struct vfsmount *mnt = s->mnt;
@@ -5262,7 +5268,10 @@ static int statmount_sb_source(struct kstatmount *s, struct seq_file *seq)
static void statmount_mnt_ns_id(struct kstatmount *s, struct mnt_namespace *ns)
{
s->sm.mask |= STATMOUNT_MNT_NS_ID;
- s->sm.mnt_ns_id = ns->ns.ns_id;
+ if (ns)
+ s->sm.mnt_ns_id = ns->ns.ns_id;
+ else
+ s->sm.mnt_ns_id = 0;
}
static int statmount_mnt_opts(struct kstatmount *s, struct seq_file *seq)
@@ -5431,7 +5440,10 @@ static int statmount_string(struct kstatmount *s, u64 flag)
break;
case STATMOUNT_MNT_POINT:
offp = &sm->mnt_point;
- ret = statmount_mnt_point(s, seq);
+ if (!s->root.mnt && !s->root.dentry)
+ ret = statmount_mnt_point_unmounted(s, seq);
+ else
+ ret = statmount_mnt_point(s, seq);
break;
case STATMOUNT_MNT_OPTS:
offp = &sm->mnt_opts;
@@ -5572,29 +5584,33 @@ static int grab_requested_root(struct mnt_namespace *ns, struct path *root)
/* locks: namespace_shared */
static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id,
- struct mnt_namespace *ns)
+ struct mnt_namespace *ns, unsigned int flags)
{
struct mount *m;
int err;
/* Has the namespace already been emptied? */
- if (mnt_ns_id && mnt_ns_empty(ns))
+ if (!(flags & STATMOUNT_BY_FD) && mnt_ns_id && mnt_ns_empty(ns))
return -ENOENT;
- s->mnt = lookup_mnt_in_ns(mnt_id, ns);
- if (!s->mnt)
- return -ENOENT;
+ if (!(flags & STATMOUNT_BY_FD)) {
+ s->mnt = lookup_mnt_in_ns(mnt_id, ns);
+ if (!s->mnt)
+ return -ENOENT;
+ }
- err = grab_requested_root(ns, &s->root);
- if (err)
- return err;
+ if (ns) {
+ err = grab_requested_root(ns, &s->root);
+ if (err)
+ return err;
+ }
/*
* Don't trigger audit denials. We just want to determine what
* mounts to show users.
*/
m = real_mount(s->mnt);
- if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
+ if (ns && !is_path_reachable(m, m->mnt.mnt_root, &s->root) &&
!ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
return -EPERM;
@@ -5718,7 +5734,7 @@ static int prepare_kstatmount(struct kstatmount *ks, struct mnt_id_req *kreq,
}
static int copy_mnt_id_req(const struct mnt_id_req __user *req,
- struct mnt_id_req *kreq)
+ struct mnt_id_req *kreq, unsigned int flags)
{
int ret;
size_t usize;
@@ -5736,11 +5752,16 @@ static int copy_mnt_id_req(const struct mnt_id_req __user *req,
ret = copy_struct_from_user(kreq, sizeof(*kreq), req, usize);
if (ret)
return ret;
- if (kreq->spare != 0)
- return -EINVAL;
- /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */
- if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET)
- return -EINVAL;
+ if (flags & STATMOUNT_BY_FD) {
+ if (kreq->mnt_id || kreq->mnt_ns_id)
+ return -EINVAL;
+ } else {
+ if (kreq->fd != 0)
+ return -EINVAL;
+ /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */
+ if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET)
+ return -EINVAL;
+ }
return 0;
}
@@ -5749,20 +5770,21 @@ static int copy_mnt_id_req(const struct mnt_id_req __user *req,
* that, or if not simply grab a passive reference on our mount namespace and
* return that.
*/
-static struct mnt_namespace *grab_requested_mnt_ns(const struct mnt_id_req *kreq)
+static struct mnt_namespace *grab_requested_mnt_ns(const struct mnt_id_req *kreq,
+ unsigned int flags)
{
struct mnt_namespace *mnt_ns;
- if (kreq->mnt_ns_id && kreq->spare)
+ if (kreq->mnt_ns_id && kreq->fd)
return ERR_PTR(-EINVAL);
if (kreq->mnt_ns_id)
return lookup_mnt_ns(kreq->mnt_ns_id);
- if (kreq->spare) {
+ if (!(flags & STATMOUNT_BY_FD) && kreq->fd) {
struct ns_common *ns;
- CLASS(fd, f)(kreq->spare);
+ CLASS(fd, f)(kreq->fd);
if (fd_empty(f))
return ERR_PTR(-EBADF);
@@ -5788,23 +5810,39 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
{
struct mnt_namespace *ns __free(mnt_ns_release) = NULL;
struct kstatmount *ks __free(kfree) = NULL;
+ struct file *file_from_fd __free(fput) = NULL;
+ struct vfsmount *fd_mnt;
struct mnt_id_req kreq;
/* We currently support retrieval of 3 strings. */
size_t seq_size = 3 * PATH_MAX;
int ret;
- if (flags)
+ if (flags & ~STATMOUNT_BY_FD)
return -EINVAL;
- ret = copy_mnt_id_req(req, &kreq);
+ ret = copy_mnt_id_req(req, &kreq, flags);
if (ret)
return ret;
- ns = grab_requested_mnt_ns(&kreq);
- if (!ns)
- return -ENOENT;
+ if (flags & STATMOUNT_BY_FD) {
+ file_from_fd = fget_raw(kreq.fd);
+ if (!file_from_fd)
+ return -EBADF;
+
+ fd_mnt = file_from_fd->f_path.mnt;
+ ns = real_mount(fd_mnt)->mnt_ns;
+ if (ns)
+ refcount_inc(&ns->passive);
+ else
+ if (!ns_capable_noaudit(file_from_fd->f_cred->user_ns, CAP_SYS_ADMIN))
+ return -ENOENT;
+ } else {
+ ns = grab_requested_mnt_ns(&kreq, flags);
+ if (!ns)
+ return -ENOENT;
+ }
- if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) &&
+ if (ns && (ns != current->nsproxy->mnt_ns) &&
!ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN))
return -ENOENT;
@@ -5817,8 +5855,11 @@ SYSCALL_DEFINE4(statmount, const struct mnt_id_req __user *, req,
if (ret)
return ret;
+ if (flags & STATMOUNT_BY_FD)
+ ks->mnt = fd_mnt;
+
scoped_guard(namespace_shared)
- ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns);
+ ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns, flags);
if (!ret)
ret = copy_statmount_to_user(ks);
@@ -5927,7 +5968,7 @@ static inline int prepare_klistmount(struct klistmount *kls, struct mnt_id_req *
if (!kls->kmnt_ids)
return -ENOMEM;
- kls->ns = grab_requested_mnt_ns(kreq);
+ kls->ns = grab_requested_mnt_ns(kreq, 0);
if (!kls->ns)
return -ENOENT;
@@ -5957,7 +5998,7 @@ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
if (!access_ok(mnt_ids, nr_mnt_ids * sizeof(*mnt_ids)))
return -EFAULT;
- ret = copy_mnt_id_req(req, &kreq);
+ ret = copy_mnt_id_req(req, &kreq, 0);
if (ret)
return ret;
diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
index 7fa67c2031a5..3eaa21d85531 100644
--- a/include/uapi/linux/mount.h
+++ b/include/uapi/linux/mount.h
@@ -197,7 +197,7 @@ struct statmount {
*/
struct mnt_id_req {
__u32 size;
- __u32 spare;
+ __u32 fd;
__u64 mnt_id;
__u64 param;
__u64 mnt_ns_id;
@@ -232,4 +232,9 @@ struct mnt_id_req {
#define LSMT_ROOT 0xffffffffffffffff /* root mount */
#define LISTMOUNT_REVERSE (1 << 0) /* List later mounts first */
+/*
+ * @flag bits for statmount(2)
+ */
+#define STATMOUNT_BY_FD 0x0000001U /* want mountinfo for given fd */
+
#endif /* _UAPI_LINUX_MOUNT_H */
--
2.51.1
* Re: [PATCH v4] statmount: accept fd as a parameter
From: Bhavik Sachdev @ 2025-10-29 5:33 UTC
To: Bhavik Sachdev, Alexander Viro, Christian Brauner
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Pavel Tikhomirov,
Jan Kara, John Garry, Arnaldo Carvalho de Melo, Darrick J . Wong,
Namhyung Kim, Ingo Molnar, Andrei Vagin, Alexander Mikhalitsyn
On Wed Oct 29, 2025 at 10:47 AM IST, Bhavik Sachdev wrote:
> Changes from v3 [2] to v4:
> * Change the string returned when there is no mountpoint to be
> "[unmounted]" instead of "[detached]".
> * Remove the new DEFINE_FREE put_file and use the one already present in
> include/linux/file.h (fput) [3].
> * Inside listmount consistently pass 0 in flags to copy_mnt_id_req and
> prepare_klistmount()->grab_requested_mnt_ns() and remove flags from the
> prepare_klistmount prototype.
> * If STATMOUNT_BY_FD is set, check for mnt_ns_id && mnt_id to be 0.
What I meant to say is that we check that mnt_ns_id == 0 and mnt_id == 0
when STATMOUNT_BY_FD is specified.
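As a rough userspace model of that check (not the kernel code itself), the request validation in copy_mnt_id_req could be sketched as follows. The struct and the MODEL_ names are hypothetical stand-ins, and the value of MODEL_MNT_UNIQUE_ID_OFFSET is an assumption about MNT_UNIQUE_ID_OFFSET in fs/namespace.c.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_STATMOUNT_BY_FD		0x00000001U
/* Assumption: mirrors MNT_UNIQUE_ID_OFFSET in fs/namespace.c. */
#define MODEL_MNT_UNIQUE_ID_OFFSET	(1ULL << 31)

/* Hypothetical stand-in for struct mnt_id_req after this patch. */
struct model_req {
	uint32_t size;
	uint32_t fd;
	uint64_t mnt_id;
	uint64_t param;
	uint64_t mnt_ns_id;
};

/* Returns 0 if the request is acceptable, -EINVAL otherwise. */
int model_validate_req(const struct model_req *req, unsigned int flags)
{
	if (flags & MODEL_STATMOUNT_BY_FD) {
		/* The mount is identified by fd, so both ids must be 0. */
		if (req->mnt_id || req->mnt_ns_id)
			return -EINVAL;
	} else {
		/* Without the flag, the fd field must be left at 0. */
		if (req->fd != 0)
			return -EINVAL;
		/* The first valid unique mount id is the offset + 1. */
		if (req->mnt_id <= MODEL_MNT_UNIQUE_ID_OFFSET)
			return -EINVAL;
	}
	return 0;
}
```

The two branches are mutually exclusive: a request either names the mount by fd or by mnt_id, and mixing the two is rejected.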
>
> Changes from v2 [4] to v3:
> * Rename STATMOUNT_FD flag to STATMOUNT_BY_FD.
> * Fixed UAF bug caused by the reference to fd_mount being bound by scope
> of CLASS(fd_raw, f)(kreq.fd) by using fget_raw instead.
> * Reused @spare parameter in mnt_id_req instead of adding new fields to
> the struct.
>
> Changes from v1 [5] to v2:
> v1 of this patchset took a different approach and introduced a new
> umount_mnt_ns, to which "unmounted" mounts would be moved (instead of
> their namespace being NULL), thus keeping them available via
> statmount.
>
> Introducing umount_mnt_ns complicated namespace locking and modified
> performance-sensitive code [6], and it was agreed that fd-based
> statmount would be better.
>
> [1]: https://github.com/checkpoint-restore/criu/pull/2754
> [2]: https://lore.kernel.org/all/20251024181443.786363-1-b.sachdev1904@gmail.com/
> [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/file.h#n97
> [4]: https://lore.kernel.org/linux-fsdevel/20251011124753.1820802-1-b.sachdev1904@gmail.com/
> [5]: https://lore.kernel.org/linux-fsdevel/20251002125422.203598-1-b.sachdev1904@gmail.com/
> [6]: https://lore.kernel.org/linux-fsdevel/7e4d9eb5-6dde-4c59-8ee3-358233f082d0@virtuozzo.com/