From: Christian Brauner <brauner@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: "Jan Kara" <jack@suse.cz>,
linux-fsdevel@vger.kernel.org,
"Josef Bacik" <josef@toxicpanda.com>,
"Jeff Layton" <jlayton@kernel.org>, "Mike Yuan" <me@yhndnzj.com>,
"Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>,
"Lennart Poettering" <mzxreary@0pointer.de>,
"Daan De Meyer" <daan.j.demeyer@gmail.com>,
"Aleksa Sarai" <cyphar@cyphar.com>,
"Alexander Viro" <viro@zeniv.linux.org.uk>,
"Jens Axboe" <axboe@kernel.dk>, "Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Simon Horman" <horms@kernel.org>,
"Chuck Lever" <chuck.lever@oracle.com>,
linux-nfs@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
cgroups@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH 27/32] nsfs: support file handles
Date: Thu, 11 Sep 2025 11:31:30 +0200
Message-ID: <20250911-werken-raubzug-64735473739c@brauner>
In-Reply-To: <CAOQ4uxgtQQa-jzsnTBxgUTPzgtCiAaH8X6ffMqd+1Y5Jjy0dmQ@mail.gmail.com>
On Wed, Sep 10, 2025 at 07:21:22PM +0200, Amir Goldstein wrote:
> On Wed, Sep 10, 2025 at 4:39 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > A while ago we added support for file handles to pidfs so pidfds can be
> > encoded and decoded as file handles. Userspace has adopted this quickly
> > and it's proven very useful.
>
> > Pidfd file handles are exhaustive meaning
> > they don't require a handle on another pidfd to pass to
> > open_by_handle_at() so it can derive the filesystem to decode in.
> >
> > Implement the exhaustive file handles for namespaces as well.
>
> I think you decided to split the "exhaustive" part into another patch,
> so better drop this paragraph?
Yes, good point. I've done that.
> I am missing an explanation about the permissions for
> opening these file handles.
>
> My understanding of the code is that the opener needs to meet one of
> the conditions:
> 1. user has CAP_SYS_ADMIN in the userns owning the opened namespace
> 2. current task is in the opened namespace
Yes.
>
> But I do not fully understand the rationale behind the 2nd condition,
> that is, when is it useful?
A caller is always able to open a file descriptor to its own set of
namespaces. File handles will behave the same way.
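To spell out the two conditions, here is a minimal userspace sketch of
the decision. It is a model for illustration, not the kernel code: the
function name and both parameters are made up, and they stand in for
the current_in_namespace() and ns_capable(owning_ns, CAP_SYS_ADMIN)
checks in the patch. The extra child_reaper check for pid namespaces is
left out.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the permission rule for decoding an nsfs file
 * handle. in_namespace: the caller is a member of the target namespace.
 * has_cap_sys_admin: the caller has CAP_SYS_ADMIN in the user namespace
 * that owns the target namespace.
 */
static int ns_handle_may_open(bool in_namespace, bool has_cap_sys_admin)
{
	/* Condition 2: a task may always open its own namespaces. */
	if (in_namespace)
		return 0;

	/* Condition 1: otherwise the capability check decides. */
	if (has_cap_sys_admin)
		return 0;

	return -EPERM;
}
```

Under this model only the combination "not in the namespace and no
capability" is refused, so a test of condition 2 on its own needs a
caller that is in the namespace but unprivileged.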
> And as far as I can tell, your selftest does not cover this condition
> (only both true or both false)?
I've added this now.
>
> I suggest to start with allowing only the useful and important
> cases, so if cond #1 is useful enough, drop cond #2 and we can add
> it later if needed and then your selftests already cover cond #1 true and false.
>
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
>
> After documenting the permissions, with or without dropping cond #2,
> feel free to add:
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Thanks!
>
> > ---
> > fs/nsfs.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/exportfs.h | 6 ++
> > 2 files changed, 182 insertions(+)
> >
> > diff --git a/fs/nsfs.c b/fs/nsfs.c
> > index 6f8008177133..a1585a2f4f03 100644
> > --- a/fs/nsfs.c
> > +++ b/fs/nsfs.c
> > @@ -13,6 +13,12 @@
> > #include <linux/nsfs.h>
> > #include <linux/uaccess.h>
> > #include <linux/mnt_namespace.h>
> > +#include <linux/ipc_namespace.h>
> > +#include <linux/time_namespace.h>
> > +#include <linux/utsname.h>
> > +#include <linux/exportfs.h>
> > +#include <linux/nstree.h>
> > +#include <net/net_namespace.h>
> >
> > #include "mount.h"
> > #include "internal.h"
> > @@ -417,12 +423,182 @@ static const struct stashed_operations nsfs_stashed_ops = {
> > .put_data = nsfs_put_data,
> > };
> >
> > +struct nsfs_fid {
> > + u64 ns_id;
> > + u32 ns_type;
> > + u32 ns_inum;
> > +} __attribute__ ((packed));
> > +
> > +#define NSFS_FID_SIZE (sizeof(struct nsfs_fid) / sizeof(u32))
> > +
> > +static int nsfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
> > + struct inode *parent)
> > +{
> > + struct nsfs_fid *fid = (struct nsfs_fid *)fh;
> > + struct ns_common *ns = inode->i_private;
> > + int len = *max_len;
> > +
> > + /*
> > + * TODO:
> > + * For hierarchical namespaces we should start to encode the
> > + * parent namespace. Then userspace can walk a namespace
> > + * hierarchy purely based on file handles.
> > + */
> > + if (parent)
> > + return FILEID_INVALID;
> > +
> > + if (len < NSFS_FID_SIZE) {
> > + *max_len = NSFS_FID_SIZE;
> > + return FILEID_INVALID;
> > + }
> > +
> > + len = NSFS_FID_SIZE;
> > +
> > + fid->ns_id = ns->ns_id;
> > + fid->ns_type = ns->ops->type;
> > + fid->ns_inum = inode->i_ino;
> > + *max_len = len;
> > + return FILEID_NSFS;
> > +}
> > +
> > +static struct dentry *nsfs_fh_to_dentry(struct super_block *sb, struct fid *fh,
> > + int fh_len, int fh_type)
> > +{
> > + struct path path __free(path_put) = {};
> > + struct nsfs_fid *fid = (struct nsfs_fid *)fh;
> > + struct user_namespace *owning_ns = NULL;
> > + struct ns_common *ns;
> > + int ret;
> > +
> > + if (fh_len < NSFS_FID_SIZE)
> > + return NULL;
> > +
> > + switch (fh_type) {
> > + case FILEID_NSFS:
> > + break;
> > + default:
> > + return NULL;
> > + }
> > +
> > + scoped_guard(rcu) {
> > + ns = ns_tree_lookup_rcu(fid->ns_id, fid->ns_type);
> > + if (!ns)
> > + return NULL;
> > +
> > + VFS_WARN_ON_ONCE(ns->ns_id != fid->ns_id);
> > + VFS_WARN_ON_ONCE(ns->ops->type != fid->ns_type);
> > + VFS_WARN_ON_ONCE(ns->inum != fid->ns_inum);
> > +
> > + if (!refcount_inc_not_zero(&ns->count))
> > + return NULL;
> > + }
> > +
> > + switch (ns->ops->type) {
> > +#ifdef CONFIG_CGROUPS
> > + case CLONE_NEWCGROUP:
> > + if (!current_in_namespace(to_cg_ns(ns)))
> > + owning_ns = to_cg_ns(ns)->user_ns;
> > + break;
> > +#endif
> > +#ifdef CONFIG_IPC_NS
> > + case CLONE_NEWIPC:
> > + if (!current_in_namespace(to_ipc_ns(ns)))
> > + owning_ns = to_ipc_ns(ns)->user_ns;
> > + break;
> > +#endif
> > + case CLONE_NEWNS:
> > + if (!current_in_namespace(to_mnt_ns(ns)))
> > + owning_ns = to_mnt_ns(ns)->user_ns;
> > + break;
> > +#ifdef CONFIG_NET_NS
> > + case CLONE_NEWNET:
> > + if (!current_in_namespace(to_net_ns(ns)))
> > + owning_ns = to_net_ns(ns)->user_ns;
> > + break;
> > +#endif
> > +#ifdef CONFIG_PID_NS
> > + case CLONE_NEWPID:
> > + if (!current_in_namespace(to_pid_ns(ns))) {
> > + owning_ns = to_pid_ns(ns)->user_ns;
> > + } else if (!READ_ONCE(to_pid_ns(ns)->child_reaper)) {
> > + ns->ops->put(ns);
> > + return ERR_PTR(-EPERM);
> > + }
> > + break;
> > +#endif
> > +#ifdef CONFIG_TIME_NS
> > + case CLONE_NEWTIME:
> > + if (!current_in_namespace(to_time_ns(ns)))
> > + owning_ns = to_time_ns(ns)->user_ns;
> > + break;
> > +#endif
> > +#ifdef CONFIG_USER_NS
> > + case CLONE_NEWUSER:
> > + if (!current_in_namespace(to_user_ns(ns)))
> > + owning_ns = to_user_ns(ns);
> > + break;
> > +#endif
> > +#ifdef CONFIG_UTS_NS
> > + case CLONE_NEWUTS:
> > + if (!current_in_namespace(to_uts_ns(ns)))
> > + owning_ns = to_uts_ns(ns)->user_ns;
> > + break;
> > +#endif
> > + default:
> > + return ERR_PTR(-EOPNOTSUPP);
> > + }
> > +
> > + if (owning_ns && !ns_capable(owning_ns, CAP_SYS_ADMIN)) {
> > + ns->ops->put(ns);
> > + return ERR_PTR(-EPERM);
> > + }
> > +
> > + /* path_from_stashed() unconditionally consumes the reference. */
> > + ret = path_from_stashed(&ns->stashed, nsfs_mnt, ns, &path);
> > + if (ret)
> > + return ERR_PTR(ret);
> > +
> > + return no_free_ptr(path.dentry);
> > +}
> > +
> > +/*
> > + * Make sure that we reject any nonsensical flags that users pass via
> > + * open_by_handle_at().
> > + */
> > +#define VALID_FILE_HANDLE_OPEN_FLAGS \
> > + (O_RDONLY | O_WRONLY | O_RDWR | O_NONBLOCK | O_CLOEXEC | O_EXCL)
> > +
> > +static int nsfs_export_permission(struct handle_to_path_ctx *ctx,
> > + unsigned int oflags)
> > +{
> > + if (oflags & ~(VALID_FILE_HANDLE_OPEN_FLAGS | O_LARGEFILE))
> > + return -EINVAL;
> > +
> > + /* nsfs_fh_to_dentry() is performs further permission checks. */
> > + return 0;
> > +}
> > +
> > +static struct file *nsfs_export_open(struct path *path, unsigned int oflags)
> > +{
> > + /* Clear O_LARGEFILE as open_by_handle_at() forces it. */
> > + oflags &= ~O_LARGEFILE;
> > + return file_open_root(path, "", oflags, 0);
> > +}
> > +
> > +static const struct export_operations nsfs_export_operations = {
> > + .encode_fh = nsfs_encode_fh,
> > + .fh_to_dentry = nsfs_fh_to_dentry,
> > + .open = nsfs_export_open,
> > + .permission = nsfs_export_permission,
> > +};
> > +
> > static int nsfs_init_fs_context(struct fs_context *fc)
> > {
> > struct pseudo_fs_context *ctx = init_pseudo(fc, NSFS_MAGIC);
> > if (!ctx)
> > return -ENOMEM;
> > ctx->ops = &nsfs_ops;
> > + ctx->eops = &nsfs_export_operations;
> > ctx->dops = &ns_dentry_operations;
> > fc->s_fs_info = (void *)&nsfs_stashed_ops;
> > return 0;
> > diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> > index cfb0dd1ea49c..3aac58a520c7 100644
> > --- a/include/linux/exportfs.h
> > +++ b/include/linux/exportfs.h
> > @@ -122,6 +122,12 @@ enum fid_type {
> > FILEID_BCACHEFS_WITHOUT_PARENT = 0xb1,
> > FILEID_BCACHEFS_WITH_PARENT = 0xb2,
> >
> > + /*
> > + *
> > + * 64 bit namespace identifier, 32 bit namespace type, 32 bit inode number.
> > + */
> > + FILEID_NSFS = 0xf1,
> > +
> > /*
> > * 64 bit unique kernfs id
> > */
> >
> > --
> > 2.47.3
> >
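For readers following the handle layout in the quoted diff, a small
userspace sketch of the size arithmetic may help. It only mirrors the
struct from the patch; the names nsfs_fid_model, NSFS_FID_MODEL_WORDS
and nsfs_fid_model_fits are hypothetical and exist purely for this
illustration. It assumes a compiler that understands the GCC-style
packed attribute.

```c
#include <stdint.h>

/* Userspace mirror of the handle payload from the patch; illustrative only. */
struct nsfs_fid_model {
	uint64_t ns_id;   /* 64 bit namespace identifier */
	uint32_t ns_type; /* 32 bit namespace type (a CLONE_NEW* value) */
	uint32_t ns_inum; /* 32 bit inode number */
} __attribute__((packed));

/* File handle lengths are counted in 32 bit words, as NSFS_FID_SIZE does. */
#define NSFS_FID_MODEL_WORDS \
	(sizeof(struct nsfs_fid_model) / sizeof(uint32_t))

/* Returns 1 if a buffer of max_words 32 bit slots can hold the fid, else 0. */
static int nsfs_fid_model_fits(int max_words)
{
	return max_words >= (int)NSFS_FID_MODEL_WORDS;
}
```

The payload is 16 bytes, so the handle occupies four 32 bit words; a
caller offering fewer words falls into the too-short path that the
patch reports by writing the required length back through max_len.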