* [GIT PULL] vfs pidfs
@ 2024-11-15 14:04 Christian Brauner
2024-11-18 19:49 ` pr-tracker-bot
0 siblings, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2024-11-15 14:04 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel
Hey Linus,
/* Summary */
This adds a new ioctl to retrieve information about a pidfd.
A common pattern when using pidfds is having to get information about
the process, which currently requires /proc being mounted, resolving the
fd to a pid, and then doing manual string parsing of /proc/N/status and
friends. This needs to be reimplemented over and over in all userspace
projects (e.g.: it has been reimplemented in systemd, dbus, dbus-daemon,
polkit so far), and requires additional care in checking that the fd is
still valid after having parsed the data, to avoid races.
Having a programmatic API that can be used directly removes all these
requirements, including having /proc mounted.
As discussed at LPC24, add an ioctl with an extensible struct so that
more parameters can be added later if needed. Start with returning
pid/tgid/ppid and some creds unconditionally, and cgroupid optionally.
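A minimal sketch of how a caller might use the ioctl described above, with no
/proc involved. Everything prefixed SK_ and the struct pidfd_info_v0 type are
local, hypothetical stand-ins: the field layout, the 64-byte size, the mask
bits and the ioctl number reflect my reading of include/uapi/linux/pidfd.h and
should be checked against that header (and replaced by it) in real code. The
syscall number fallback is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <linux/ioctl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* assumption: generic syscall number */
#endif

/* Assumed mirror of the first published struct pidfd_info. */
struct pidfd_info_v0 {
	uint64_t mask;     /* in: requested fields, out: fields filled in */
	uint64_t cgroupid; /* only meaningful if the cgroupid bit is set in mask */
	uint32_t pid, tgid, ppid;
	uint32_t ruid, rgid, euid, egid, suid, sgid, fsuid, fsgid;
	uint32_t spare0;
};
static_assert(sizeof(struct pidfd_info_v0) == 64, "assumed first published size");

/* Assumed values; prefer the definitions from <linux/pidfd.h>. */
#define SK_PIDFD_INFO_PID      (1ULL << 0)
#define SK_PIDFD_INFO_CREDS    (1ULL << 1)
#define SK_PIDFD_INFO_CGROUPID (1ULL << 2)
#define SK_PIDFD_GET_INFO      _IOWR(0xFF, 11, struct pidfd_info_v0)

/* Returns 0 and fills *info on success, or a negative errno value. */
static int pidfd_query(pid_t pid, struct pidfd_info_v0 *info)
{
	int fd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (fd < 0)
		return -errno;

	memset(info, 0, sizeof(*info));
	/* Ask for the optional cgroupid; pid and creds are returned regardless. */
	info->mask = SK_PIDFD_INFO_CGROUPID;

	int rc = ioctl(fd, SK_PIDFD_GET_INFO, info) < 0 ? -errno : 0;
	close(fd);
	return rc;
}
```

On a kernel without the ioctl, pidfd_query() simply returns a negative errno,
so callers can fall back to the /proc parsing they already have. Because the
struct size is encoded in the ioctl number, a caller built against this smaller
layout should keep working if the kernel's struct grows later.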
/* Testing */
gcc version 14.2.0 (Debian 14.2.0-6)
Debian clang version 16.0.6 (27+b1)
All patches are based on v6.12-rc3 and have been sitting in linux-next.
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
No known conflicts.
The following changes since commit 8e929cb546ee42c9a61d24fae60605e9e3192354:
Linux 6.12-rc3 (2024-10-13 14:33:32 -0700)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.13.pidfs
for you to fetch changes up to cdda1f26e74bac732eca537a69f19f6a37b641be:
pidfd: add ioctl to retrieve pid info (2024-10-24 13:54:51 +0200)
Please consider pulling these changes from the signed vfs-6.13.pidfs tag.
Thanks!
Christian
----------------------------------------------------------------
vfs-6.13.pidfs
----------------------------------------------------------------
Luca Boccassi (1):
pidfd: add ioctl to retrieve pid info
fs/pidfs.c | 86 ++++++++++++++++++++++++-
include/uapi/linux/pidfd.h | 50 ++++++++++++++
tools/testing/selftests/pidfd/pidfd_open_test.c | 82 ++++++++++++++++++++++-
3 files changed, 214 insertions(+), 4 deletions(-)
* Re: [GIT PULL] vfs pidfs
2024-11-15 14:04 Christian Brauner
@ 2024-11-18 19:49 ` pr-tracker-bot
0 siblings, 0 replies; 6+ messages in thread
From: pr-tracker-bot @ 2024-11-18 19:49 UTC (permalink / raw)
To: Christian Brauner
Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel
The pull request you sent on Fri, 15 Nov 2024 15:04:33 +0100:
> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.13.pidfs
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/909d3b571e5a77aef0949818de1efda129dcddbd
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* [GIT PULL] vfs pidfs
@ 2025-01-18 13:00 Christian Brauner
2025-01-20 18:59 ` pr-tracker-bot
0 siblings, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2025-01-18 13:00 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel
Hey Linus,
/* Summary */
This contains pidfs updates for this cycle:
- Rework inode number allocation
Recently we received a patchset that aims to enable file handle
encoding and decoding via name_to_handle_at(2) and
open_by_handle_at(2).
A crucial step in the patch series is how to go from inode number to
struct pid without leaking information into unprivileged contexts. The
issue is that in order to find a struct pid the pid number in the
initial pid namespace must be encoded into the file handle via
name_to_handle_at(2).
This can be used by containers using a separate pid namespace to learn
what the pid number of a given process in the initial pid namespace
is. While this is a weak information leak it could be used in various
exploits and in general is an ugly wart in the design.
To solve this problem a new way is needed to look up a struct pid based
on the inode number allocated for that struct pid. The other part is
to remove the custom inode number allocation on 32bit systems that is
also an ugly wart that should go away.
Allocate unique identifiers for struct pid by simply incrementing a 64
bit counter and insert each struct pid into an rbtree so that it can be
looked up when decoding file handles. This avoids leaking actual pids
across pid namespaces in file handles.
On both 64 bit and 32 bit the same 64 bit identifier is used to look up
struct pid in the rbtree. On 64 bit the unique identifier for struct pid
simply becomes the inode number. Comparing two pidfds continues to be as
simple as comparing inode numbers.
On 32 bit the 64 bit number assigned to struct pid is split into two 32
bit numbers. The lower 32 bits are used as the inode number and the
upper 32 bits are used as the inode generation number. Whenever a
wraparound happens on 32 bit the 64 bit number will be incremented by 2
so inode numbering starts at 2 again.
When a wraparound happens on 32 bit multiple pidfds with the same inode
number are likely to exist. This isn't a problem since before pidfs
pidfds used the anonymous inode meaning all pidfds had the same inode
number. On 32 bit userspace can thus reconstruct the 64 bit identifier
by retrieving both the inode number and the inode generation number to
compare, or use file handles. This gives the same guarantees on both 32
bit and 64 bit.
- Implement file handle support
This is based on custom export operation methods which allow pidfs to
implement permission checking and opening of pidfs file handles
cleanly without hacking around in the core file handle code too much.
- Support bind-mounts
Allow bind-mounting pidfds, similar to nsfs. This allows pidfds to be
safely recovered and checked for process recycling.
Instead of checking d_ops for both nsfs and pidfs we could in a
follow-up patch add a flag argument to struct dentry_operations that
functions similarly to file_operations->fop_flags.
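The 32 bit handling described in the first item above can be sketched in
plain userspace C. This is only a model of the arithmetic: all names
(pid_ident32, ident_split, ident_join, ident_next, ident_same) are hypothetical
and are not the kernel's helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What a 32 bit system would expose for one 64 bit identifier. */
struct pid_ident32 {
	uint32_t ino; /* lower 32 bits: inode number */
	uint32_t gen; /* upper 32 bits: inode generation number */
};

static struct pid_ident32 ident_split(uint64_t id)
{
	struct pid_ident32 s = { (uint32_t)id, (uint32_t)(id >> 32) };
	return s;
}

/* Userspace side: rebuild the 64 bit identifier from inode and generation. */
static uint64_t ident_join(struct pid_ident32 s)
{
	return ((uint64_t)s.gen << 32) | s.ino;
}

/*
 * Hand out the next identifier. When the lower 32 bits have wrapped to 0
 * the counter is bumped by 2 so that inode numbering starts at 2 again.
 */
static uint64_t ident_next(uint64_t *counter)
{
	if ((uint32_t)*counter == 0)
		*counter += 2;
	return (*counter)++;
}

/* Two pidfds refer to the same struct pid only if both halves match. */
static bool ident_same(struct pid_ident32 a, struct pid_ident32 b)
{
	return a.ino == b.ino && a.gen == b.gen;
}

/* Small usage example around a 32 bit wraparound. */
static void ident_selfcheck(void)
{
	uint64_t counter = 0xFFFFFFFEULL;
	uint64_t before = ident_next(&counter); /* 0x00000000fffffffe */
	(void)ident_next(&counter);             /* 0x00000000ffffffff */
	uint64_t after = ident_next(&counter);  /* 0x0000000100000002 */

	assert(ident_split(before).gen == 0);
	assert(ident_split(after).ino == 2 && ident_split(after).gen == 1);
	assert(ident_join(ident_split(after)) == after);
	assert(!ident_same(ident_split(before), ident_split(after)));
}
```

After a wraparound two live pidfds can share an inode number, which is why the
comparison above also looks at the generation half. On 64 bit no split is
needed and comparing inode numbers alone is sufficient.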
/* Testing */
gcc version 14.2.0 (Debian 14.2.0-6)
Debian clang version 16.0.6 (27+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
No known conflicts.
The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37:
Linux 6.13-rc1 (2024-12-01 14:28:56 -0800)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.14-rc1.pidfs
for you to fetch changes up to 3781680fba3eab0b34b071cb9443fd5ad92d23cf:
Merge patch series "pidfs: support bind-mounts" (2024-12-22 11:03:19 +0100)
Please consider pulling these changes from the signed vfs-6.14-rc1.pidfs tag.
Thanks!
Christian
----------------------------------------------------------------
vfs-6.14-rc1.pidfs
----------------------------------------------------------------
Christian Brauner (16):
pidfs: rework inode number allocation
pidfs: remove 32bit inode number handling
pidfs: support FS_IOC_GETVERSION
Merge patch series "pidfs: file handle preliminaries"
fhandle: simplify error handling
exportfs: add open method
fhandle: pull CAP_DAC_READ_SEARCH check into may_decode_fh()
exportfs: add permission method
pidfs: implement file handle support
Merge patch series "pidfs: implement file handle support"
pidfs: check for valid ioctl commands
selftests/pidfd: add pidfs file handle selftests
pidfs: lookup pid through rbtree
pidfs: allow bind-mounts
selftests: add pidfd bind-mount tests
Merge patch series "pidfs: support bind-mounts"
Erin Shepherd (1):
pseudofs: add support for export_ops
fs/fhandle.c | 115 +++--
fs/libfs.c | 1 +
fs/namespace.c | 10 +-
fs/pidfs.c | 298 ++++++++++--
include/linux/exportfs.h | 20 +
include/linux/pid.h | 2 +
include/linux/pidfs.h | 3 +
include/linux/pseudo_fs.h | 1 +
kernel/pid.c | 14 +-
tools/testing/selftests/pidfd/.gitignore | 2 +
tools/testing/selftests/pidfd/Makefile | 3 +-
tools/testing/selftests/pidfd/pidfd.h | 39 ++
tools/testing/selftests/pidfd/pidfd_bind_mount.c | 188 ++++++++
.../selftests/pidfd/pidfd_file_handle_test.c | 503 +++++++++++++++++++++
tools/testing/selftests/pidfd/pidfd_setns_test.c | 47 +-
tools/testing/selftests/pidfd/pidfd_wait.c | 47 +-
16 files changed, 1110 insertions(+), 183 deletions(-)
create mode 100644 tools/testing/selftests/pidfd/pidfd_bind_mount.c
create mode 100644 tools/testing/selftests/pidfd/pidfd_file_handle_test.c
* Re: [GIT PULL] vfs pidfs
2025-01-18 13:00 [GIT PULL] vfs pidfs Christian Brauner
@ 2025-01-20 18:59 ` pr-tracker-bot
0 siblings, 0 replies; 6+ messages in thread
From: pr-tracker-bot @ 2025-01-20 18:59 UTC (permalink / raw)
To: Christian Brauner
Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel
The pull request you sent on Sat, 18 Jan 2025 14:00:30 +0100:
> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.14-rc1.pidfs
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/5f85bd6aeceaecd0ff3a5ee827bf75eb6141ad55
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* [GIT PULL] vfs pidfs
@ 2025-03-22 10:13 Christian Brauner
2025-03-24 21:01 ` pr-tracker-bot
0 siblings, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2025-03-22 10:13 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel
Hey Linus,
/* Summary */
This contains updates for pidfs:
- Allow retrieving exit information after a process has been reaped
through pidfds via the new PIDFD_INFO_EXIT extension for the
PIDFD_GET_INFO ioctl. Various tools need access to information about a
process/task even after it has already been reaped.
Pidfd polling allows waiting on either task exit or for a task to have
been reaped. The contract for PIDFD_INFO_EXIT is simply that EPOLLHUP
must be observed before exit information can be retrieved, i.e., exit
information is only provided once the task has been reaped and then
can be retrieved as long as the pidfd is open.
- Add PIDFD_SELF_{THREAD,THREAD_GROUP} sentinels allowing userspace to forgo
allocating a file descriptor for their own process. This is useful in
scenarios where users want to act on their own process through pidfds and is
akin to AT_FDCWD.
- Improve premature thread-group leader and subthread exec behavior when
polling on pidfds:
(1) During a multi-threaded exec by a subthread, i.e., non-thread-group
leader thread, all other threads in the thread-group including the
thread-group leader are killed and the struct pid of the
thread-group leader will be taken over by the subthread that called
exec. IOW, two tasks change their TIDs.
(2) A premature thread-group leader exit means that the thread-group
leader exited before all of the other subthreads in the thread-group
have exited.
Both cases lead to inconsistencies for pidfd polling with PIDFD_THREAD.
Any caller that holds a PIDFD_THREAD pidfd to the current thread-group
leader may or may not see an exit notification on the file descriptor
depending on when poll is performed. If the poll is performed before the
exec of the subthread has concluded an exit notification is generated
for the old thread-group leader. If the poll is performed after the exec
of the subthread has concluded no exit notification is generated for the
old thread-group leader.
The correct behavior is to simply not generate an exit notification on
the struct pid of a subthread exec because the struct pid is taken
over by the subthread and thus remains alive.
But this is difficult to handle because a thread-group leader may exit
prematurely as mentioned in (2). In that case an exit notification is
reliably generated but the subthreads may continue to run for an
indeterminate amount of time and thus also may exec at some point.
After this pull no exit notifications will be generated for a
PIDFD_THREAD pidfd for a thread-group leader until all subthreads have
been reaped. If a subthread execs before that, no exit notification
will be generated until that task exits or it creates subthreads and
repeats the cycle.
This means an exit notification indicates the ability for the parent
to reap the child.
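The PIDFD_INFO_EXIT contract from the first item above (observe the hangup,
then ask for exit information) might look roughly like this from userspace.
This is a sketch under assumptions: struct pidfd_info_exit, the SK_ constants
and exit_status_after_reap() are local, hypothetical names, and the struct
layout, mask bit and ioctl number reflect my reading of
include/uapi/linux/pidfd.h rather than anything stated in this mail. The
exit_code field is assumed to be encoded like a wait(2) status.

```c
#include <assert.h>
#include <errno.h>
#include <linux/ioctl.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* assumption: generic syscall number */
#endif

/* Assumed layout; prefer struct pidfd_info from <linux/pidfd.h>. */
struct pidfd_info_exit {
	uint64_t mask;
	uint64_t cgroupid;
	uint32_t pid, tgid, ppid;
	uint32_t ruid, rgid, euid, egid, suid, sgid, fsuid, fsgid;
	int32_t exit_code; /* assumed wait(2)-style status */
};
static_assert(sizeof(struct pidfd_info_exit) == 64, "assumed struct size");

#define SK_PIDFD_INFO_EXIT (1ULL << 3)
#define SK_PIDFD_GET_INFO  _IOWR(0xFF, 11, struct pidfd_info_exit)

/*
 * Fork a child that exits with `status`, reap it, then read the status back
 * through the pidfd. Returns the status, or -1 if a step fails or the
 * running kernel does not support it.
 */
static int exit_status_after_reap(int status)
{
	int gate[2];
	if (pipe(gate) < 0)
		return -1;

	pid_t child = fork();
	if (child < 0) {
		close(gate[0]);
		close(gate[1]);
		return -1;
	}
	if (child == 0) {
		char c;
		close(gate[1]);
		/* Block until the parent holds a pidfd (EOF on the pipe). */
		if (read(gate[0], &c, 1) < 0)
			_exit(127);
		_exit(status);
	}
	close(gate[0]);

	int fd = (int)syscall(SYS_pidfd_open, child, 0);
	close(gate[1]);          /* release the child */
	waitpid(child, NULL, 0); /* reap it */
	if (fd < 0)
		return -1;

	int result = -1;
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	/* Hangup on the pidfd signals that the task has been reaped. */
	if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLHUP)) {
		struct pidfd_info_exit info;
		memset(&info, 0, sizeof(info));
		info.mask = SK_PIDFD_INFO_EXIT;
		if (ioctl(fd, SK_PIDFD_GET_INFO, &info) == 0 &&
		    (info.mask & SK_PIDFD_INFO_EXIT) && WIFEXITED(info.exit_code))
			result = WEXITSTATUS(info.exit_code);
	}
	close(fd);
	return result;
}
```

The pipe keeps the child alive until the parent has opened the pidfd, so the
sketch does not depend on what happens when a pidfd is created for a task that
has already exited. On kernels without this extension the function is expected
to return -1 rather than a status.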
/* Testing */
gcc version 14.2.0 (Debian 14.2.0-6)
Debian clang version 16.0.6 (27+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
No known conflicts.
The following changes since commit 2014c95afecee3e76ca4a56956a936e23283f05b:
Linux 6.14-rc1 (2025-02-02 15:39:26 -0800)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.15-rc1.pidfs
for you to fetch changes up to d40dc30c7b7c80db2100b73ac26d39c362643a39:
Merge patch series "pidfs: handle multi-threaded exec and premature thread-group leader exit" (2025-03-20 15:32:51 +0100)
Please consider pulling these changes from the signed vfs-6.15-rc1.pidfs tag.
Thanks!
Christian
----------------------------------------------------------------
vfs-6.15-rc1.pidfs
----------------------------------------------------------------
Christian Brauner (25):
selftests/pidfd: add new PIDFD_SELF* defines
Merge patch series "introduce PIDFD_SELF* sentinels"
pidfs: switch to copy_struct_to_user()
pidfd: rely on automatic cleanup in __pidfd_prepare()
pidfs: move setting flags into pidfs_alloc_file()
pidfs: use private inode slab cache
pidfs: record exit code and cgroupid at exit
pidfs: allow to retrieve exit information
selftests/pidfd: fix header inclusion
pidfs/selftests: ensure correct headers for ioctl handling
selftests/pidfd: expand common pidfd header
selftests/pidfd: add first PIDFD_INFO_EXIT selftest
selftests/pidfd: add second PIDFD_INFO_EXIT selftest
selftests/pidfd: add third PIDFD_INFO_EXIT selftest
selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest
Merge patch series "pidfs: provide information after task has been reaped"
pidfs: ensure that PIDFS_INFO_EXIT is available
pidfs: improve multi-threaded exec and premature thread-group leader exit polling
selftests/pidfd: first test for multi-threaded exec polling
selftests/pidfd: second test for multi-threaded exec polling
selftests/pidfd: third test for multi-threaded exec polling
Merge patch series "pidfs: handle multi-threaded exec and premature thread-group leader exit"
Lorenzo Stoakes (3):
pidfd: add PIDFD_SELF* sentinels to refer to own thread/process
selftests/pidfd: add tests for PIDFD_SELF_*
selftests/mm: use PIDFD_SELF in guard pages test
fs/internal.h | 1 +
fs/libfs.c | 4 +-
fs/pidfs.c | 247 +++++++-
include/linux/pidfs.h | 1 +
include/uapi/linux/pidfd.h | 31 +-
kernel/exit.c | 8 +-
kernel/fork.c | 22 +-
kernel/pid.c | 24 +-
kernel/signal.c | 108 ++--
tools/testing/selftests/mm/guard-pages.c | 16 +-
tools/testing/selftests/pidfd/.gitignore | 2 +
tools/testing/selftests/pidfd/Makefile | 4 +-
tools/testing/selftests/pidfd/pidfd.h | 109 ++++
tools/testing/selftests/pidfd/pidfd_exec_helper.c | 12 +
tools/testing/selftests/pidfd/pidfd_fdinfo_test.c | 1 +
tools/testing/selftests/pidfd/pidfd_info_test.c | 692 ++++++++++++++++++++++
tools/testing/selftests/pidfd/pidfd_open_test.c | 30 +-
tools/testing/selftests/pidfd/pidfd_setns_test.c | 45 --
tools/testing/selftests/pidfd/pidfd_test.c | 76 ++-
19 files changed, 1241 insertions(+), 192 deletions(-)
create mode 100644 tools/testing/selftests/pidfd/pidfd_exec_helper.c
create mode 100644 tools/testing/selftests/pidfd/pidfd_info_test.c
* Re: [GIT PULL] vfs pidfs
2025-03-22 10:13 Christian Brauner
@ 2025-03-24 21:01 ` pr-tracker-bot
0 siblings, 0 replies; 6+ messages in thread
From: pr-tracker-bot @ 2025-03-24 21:01 UTC (permalink / raw)
To: Christian Brauner
Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel
The pull request you sent on Sat, 22 Mar 2025 11:13:37 +0100:
> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.15-rc1.pidfs
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/df00ded23a6b4df888237333b1f86067d24113b2
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html