From: Christian Brauner <brauner@kernel.org>
To: linux-fsdevel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Seth Forshee <sforshee@kernel.org>,
Tycho Andersen <tycho@tycho.pizza>
Subject: Re: [PATCH 0/2] Move pidfd to tiny pseudo fs
Date: Tue, 13 Feb 2024 18:02:13 +0100
Message-ID: <20240213-kippt-ambiente-9162a4f7b19b@brauner>
In-Reply-To: <20240213-vfs-pidfd_fs-v1-0-f863f58cfce1@kernel.org>

On Tue, Feb 13, 2024 at 05:45:45PM +0100, Christian Brauner wrote:
> Hey,
>
> This moves pidfds from the anonymous inode infrastructure to a tiny
> pseudo filesystem. This has been on my todo for quite a while as it will
> unblock further work that we weren't able to do so far simply because of
> the very justified limitations of anonymous inodes. So yesterday I sat
> down and wrote it down.
>
> Back when I added pidfds the concept was new (on Linux) and the
> limitations were acceptable but now it's starting to hurt us. And with
> the concept of pidfds having been around quite a while and being widely
> used this is worth doing. This makes it so that:
>
> * statx() on pidfds becomes useful for the first time.
> * pidfds can be compared simply via statx() for equality.
> * pidfds have unique inode numbers for the system lifetime.
> * struct pid is now stashed in inode->i_private instead of
> file->private_data. This means it is now possible to introduce
> concepts that operate on a process once all file descriptors have been
> closed. A concrete example is kill-on-last-close.
> * file->private_data is freed up for per-file options for pidfds.
> * Each struct pid will refer to a different inode but the same struct
> pid will refer to the same inode if it's opened multiple times. This
> is in contrast to now, where every struct pid refers to the same
> inode. Even if we were to move to anon_inode_create_getfile(), which
> creates new inodes, we'd still be associating the same struct pid
> with multiple different inodes.
> * Pidfds now go through the regular dentry_open() path which means that
> all security hooks are called unblocking proper LSM management for
> pidfds. In addition fsnotify hooks are called and allow for listening
> to open events on pidfds.
>
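
To illustrate the statx() comparison from the list above, here is a
minimal userspace sketch. It is not part of the series. The helper names
pidfd_open_raw() and pidfd_same_process() are hypothetical, and the
sketch assumes a libc that provides the statx() wrapper and
SYS_pidfd_open. On a kernel without this series all pidfds share a
single anonymous inode, so the check would report "same" for any two
pidfds and is only meaningful with pidfds on the new pseudo filesystem.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for statx() and AT_EMPTY_PATH */
#endif
#include <fcntl.h>              /* AT_EMPTY_PATH */
#include <sys/stat.h>           /* statx(), struct statx, STATX_INO */
#include <sys/syscall.h>        /* SYS_pidfd_open */
#include <sys/types.h>          /* pid_t */
#include <unistd.h>             /* syscall() */

/* Hypothetical helper: thin wrapper around the pidfd_open(2) syscall. */
static int pidfd_open_raw(pid_t pid)
{
	return (int)syscall(SYS_pidfd_open, pid, 0);
}

/*
 * Hypothetical helper: compare two pidfds by device and inode number.
 * Returns 1 if both report the same inode, 0 if they differ, and -1 if
 * statx() fails on either descriptor.
 */
static int pidfd_same_process(int a, int b)
{
	struct statx sa, sb;

	if (statx(a, "", AT_EMPTY_PATH, STATX_INO, &sa) < 0 ||
	    statx(b, "", AT_EMPTY_PATH, STATX_INO, &sb) < 0)
		return -1;

	return sa.stx_dev_major == sb.stx_dev_major &&
	       sa.stx_dev_minor == sb.stx_dev_minor &&
	       sa.stx_ino == sb.stx_ino;
}
```

For example, two pidfds obtained via pidfd_open_raw(getpid()) should
compare equal, since the same struct pid maps to the same inode.
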
> The tiny pseudo filesystem is not visible anywhere in userspace exactly
> like e.g., pipefs and sockfs. There's no lookup, there's no inode
> operations in general, so nothing complex. It's hopefully the best kind
> of dumb there is. Dentries and inodes are always deleted when the last
> pidfd is closed.
>
> I've made the new code optional and placed it under CONFIG_FS_PIDFD but
> I'm confident we can remove that very soon. This takes some inspiration
> from nsfs which uses a similar stashing mechanism.
>
> Thanks!
> Christian
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
>
> ---
> base-commit: 3f643cd2351099e6b859533b6f984463e5315e5f
> change-id: 20240212-vfs-pidfd_fs-9a6e49283d80

I forgot to mention that pidfds are explicitly not simply directory
inodes in procfs, for various reasons, so that isn't an option I want to
pursue. Integrating them into procfs would add a nasty level of
complexity and make for very ugly and convoluted code, especially in how
this would need to be hooked into copy_process() and other locations.
It would also pose significant security and permission checking
challenges for userspace because it is generally not safe to send around
file descriptors for /proc/<pid> directories. That's a pretty big attack
vector and a cause of security issues. So really this is not a path that
I want to go down. It defeats the whole purpose of pidfds as opaque,
easily delegatable handles.

Oh, and the tree is vfs.pidfd at the usual location:
https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git