From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL 06/16 for v7.2] kernel task_exec_state
Date: Fri, 12 Jun 2026 17:13:03 +0200
Message-ID: <20260612-kernel-task_exec_state-v72-c39ca82510c0@brauner>
In-Reply-To: <20260612-vfs-v72-20facee87e19@brauner>
Hey Linus,
/* Summary */
This introduces a new per-task task_exec_state structure and relocates
the dumpable mode and the user namespace captured at execve() from
mm_struct onto it. It stays attached to the task for its full
lifetime.
__ptrace_may_access() and several /proc owner and visibility checks
need to consult two pieces of state for any observable task, including
zombies that have already gone through exit_mm(): the dumpable mode
and the user namespace captured at execve(). Both live on mm_struct
today, which exit_mm() clears from the task long before the task is
reaped. A reader that races with do_exit() observes task->mm == NULL
and either fails the check or falls back to init_user_ns, which
denies legitimate access to non-dumpable zombies that were running in
a nested user namespace.
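The race described above can be illustrated with a small userspace model.
This is only a sketch: the struct and function names (mm_model,
exec_state_model, owner_ns_via_mm(), ...) are hypothetical and do not
correspond to the kernel's actual definitions. It shows why a lookup that
goes through task->mm degrades to init_user_ns once the mm is detached,
while a lookup through state that stays attached to the task does not.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these types match the kernel's layout. */
struct user_ns { int level; };

static struct user_ns init_user_ns = { .level = 0 };

struct mm_model { struct user_ns *user_ns; };         /* old home of the state */
struct exec_state_model { struct user_ns *user_ns; }; /* new home of the state */

struct task_model {
	struct mm_model *mm;                 /* cleared when the task exits */
	struct exec_state_model *exec_state; /* kept for the task's lifetime */
};

/* Old-style lookup: falls back to init_user_ns once the mm is gone. */
static struct user_ns *owner_ns_via_mm(const struct task_model *t)
{
	return t->mm ? t->mm->user_ns : &init_user_ns;
}

/* New-style lookup: independent of whether the mm is still attached. */
static struct user_ns *owner_ns_via_exec_state(const struct task_model *t)
{
	return t->exec_state->user_ns;
}

/* Models the effect of exit_mm(): detach the mm, keep the exec state. */
static void model_exit_mm(struct task_model *t)
{
	t->mm = NULL;
}
```

For a task that ran in a nested user namespace, the mm-based lookup returns
init_user_ns after model_exit_mm(), whereas the exec-state lookup keeps
returning the nested namespace.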
mm_struct loses ->user_ns and the dumpability bits in ->flags.
MMF_DUMPABLE_BITS is reserved so the MMF_DUMP_FILTER_* layout exposed
via /proc/<pid>/coredump_filter stays stable. task->user_dumpable and
its exit_mm() snapshot are removed.
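To make the point about keeping the low bits reserved concrete, here is a
minimal sketch of the bit arithmetic. The MODEL_* constants are assumptions
chosen to mirror how earlier kernels lay out the dumpable and dump-filter
bits (two low bits, followed by a nine-bit filter field); the helper names
are hypothetical. As long as the low bits stay reserved, the shift that maps
the filter field to the value userspace reads and writes does not change.

```c
#include <assert.h>

/* Assumed values for illustration, modelled on earlier kernels. */
#define MODEL_DUMPABLE_BITS     2 /* low bits, reserved rather than reused */
#define MODEL_DUMP_FILTER_SHIFT MODEL_DUMPABLE_BITS
#define MODEL_DUMP_FILTER_BITS  9
#define MODEL_DUMP_FILTER_MASK \
	(((1UL << MODEL_DUMP_FILTER_BITS) - 1) << MODEL_DUMP_FILTER_SHIFT)

/* Store a userspace-visible filter value into the flags word. */
static unsigned long model_set_filter(unsigned long flags, unsigned long filter)
{
	flags &= ~MODEL_DUMP_FILTER_MASK;
	flags |= (filter << MODEL_DUMP_FILTER_SHIFT) & MODEL_DUMP_FILTER_MASK;
	return flags;
}

/* Recover the userspace-visible filter value from the flags word. */
static unsigned long model_get_filter(unsigned long flags)
{
	return (flags & MODEL_DUMP_FILTER_MASK) >> MODEL_DUMP_FILTER_SHIFT;
}
```

If the reserved bits were reclaimed and the shift reduced, every filter bit
would move, which is exactly what reserving them avoids.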
task_exec_state is the privilege domain established by an execve().
Within a thread group it is shared via refcount; across thread groups
each task has its own:
- CLONE_VM siblings (thread-group members, io_uring workers)
refcount-share the parent's exec_state.
- Non-CLONE_VM clones (fork(), vfork() without CLONE_VM) allocate a
fresh exec_state inheriting the parent's dumpable mode and user_ns.
- execve() in the child allocates a fresh instance and installs it
under task_lock + exec_update_lock via task_exec_state_replace().
- Credential changes (setresuid, capset, ...) and
prctl(PR_SET_DUMPABLE) update dumpability on the current task's
exec_state, i.e., on the thread group's shared instance.
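The sharing rules in the list above can be sketched as a userspace model.
Everything here is hypothetical: the names (exec_state_model, es_clone(),
es_exec(), MODEL_CLONE_VM) are invented for illustration, the refcount is a
plain int rather than an atomic, and no locking is modelled, whereas the
series installs the new instance under task_lock + exec_update_lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's enum or struct. */
enum dumpable_model { DUMP_DISABLE = 0, DUMP_USER = 1, DUMP_ROOT = 2 };

struct exec_state_model {
	int refcount;                 /* plain int; a real one would be atomic */
	enum dumpable_model dumpable;
	int user_ns_id;               /* stands in for a user namespace pointer */
};

#define MODEL_CLONE_VM 0x1u

static struct exec_state_model *es_alloc(enum dumpable_model d, int ns)
{
	struct exec_state_model *es = malloc(sizeof(*es));

	if (!es)
		return NULL;
	es->refcount = 1;
	es->dumpable = d;
	es->user_ns_id = ns;
	return es;
}

static void es_put(struct exec_state_model *es)
{
	if (es && --es->refcount == 0)
		free(es);
}

/* Clone: CLONE_VM siblings share, everything else gets an inheriting copy. */
static struct exec_state_model *es_clone(struct exec_state_model *parent,
					 unsigned int flags)
{
	if (flags & MODEL_CLONE_VM) {
		parent->refcount++;
		return parent;
	}
	return es_alloc(parent->dumpable, parent->user_ns_id);
}

/* Exec: install a fresh instance and drop the reference to the old one. */
static struct exec_state_model *es_exec(struct exec_state_model *old,
					enum dumpable_model d, int ns)
{
	struct exec_state_model *fresh = es_alloc(d, ns);

	if (!fresh)
		return old; /* keep the old state if allocation fails */
	es_put(old);
	return fresh;
}
```

In this model, changing dumpable through one thread is visible to every
CLONE_VM sibling because they hold the same instance, while a forked child
keeps the value it inherited at clone time.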
On top of this, exec_mmap() no longer tears down the old mm while
holding exec_update_lock for writing and cred_guard_mutex. Neither
lock is needed for that: exec_update_lock only exists to make the mm
swap atomic with the later commit_creds() and all its readers operate
on the new mm; none looks at the detached old mm. The cost was real:
__mmput() runs exit_mmap() over the entire old address space and can
block in exit_aio() waiting for in-flight AIO, so execve() of a large
process blocked ptrace_attach() and every exec_update_lock reader for
the duration of the teardown. The old mm is now stashed in
bprm->old_mm and released from setup_new_exec() after both locks are
dropped, with a backstop in free_bprm() for the error paths.
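The stash-and-release pattern described above can be sketched in userspace
as follows. This is a simplified model under stated assumptions: a single
boolean stands in for both exec_update_lock and cred_guard_mutex, all
model_* names are hypothetical, and lock handling on the error paths is not
modelled. The point is only that the expensive teardown is recorded as
happening after the lock is dropped, and that the backstop is a no-op once
the normal path has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One flag stands in for both exec locks; purely illustrative. */
struct lock_model { bool held; };

struct mm_model {
	bool torn_down;            /* set once the teardown has run */
	bool torn_down_under_lock; /* records whether the lock was held then */
};

struct bprm_model { struct mm_model *old_mm; }; /* stash for the old mm */
struct task_model { struct mm_model *mm; };

static struct lock_model exec_lock;

/* Stands in for the expensive teardown of an address space. */
static void model_mmput(struct mm_model *mm)
{
	mm->torn_down = true;
	mm->torn_down_under_lock = exec_lock.held;
}

/* Swap in the new mm under the lock; stash the old one instead of freeing. */
static void model_exec_mmap(struct task_model *t, struct bprm_model *bprm,
			    struct mm_model *new_mm)
{
	exec_lock.held = true;
	bprm->old_mm = t->mm;
	t->mm = new_mm;
}

/* Release the stashed mm at most once. */
static void model_release_old_mm(struct bprm_model *bprm)
{
	if (bprm->old_mm) {
		model_mmput(bprm->old_mm);
		bprm->old_mm = NULL;
	}
}

/* Normal path: drop the lock first, then do the teardown. */
static void model_setup_new_exec(struct bprm_model *bprm)
{
	exec_lock.held = false;
	model_release_old_mm(bprm);
}

/* Backstop: frees the stash if the normal path never got to it. */
static void model_free_bprm(struct bprm_model *bprm)
{
	model_release_old_mm(bprm);
}
```

Clearing the stash pointer on release is what lets the backstop run
unconditionally without risking a double teardown.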
/* Testing */
gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
The following changes since commit 5200f5f493f79f14bbdc349e402a40dfb32f23c8:
Linux 7.1-rc4 (2026-05-17 13:59:58 -0700)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/kernel-7.2-rc1.task_exec_state
for you to fetch changes up to 38205ecbe6b6dc47968ad4e9c978e2117720969e:
exec: free the old mm outside the exec locks (2026-05-26 11:02:02 +0200)
----------------------------------------------------------------
kernel-7.2-rc1.task_exec_state
Please consider pulling these changes from the signed kernel-7.2-rc1.task_exec_state tag.
Thanks!
Christian
----------------------------------------------------------------
Christian Brauner (2):
Merge patch series "exec: introduce task_exec_state for exec-time metadata"
exec: free the old mm outside the exec locks
Christian Brauner (Amutable) (4):
sched/coredump: introduce enum task_dumpable
exec: introduce struct task_exec_state
ptrace: add ptracer_access_allowed()
exec_state: relocate dumpable information
arch/arm64/kernel/mte.c | 6 +-
drivers/firmware/efi/efi.c | 1 -
fs/coredump.c | 22 +++-----
fs/exec.c | 65 +++++++++++++--------
fs/pidfs.c | 23 +++-----
fs/proc/base.c | 39 ++++++-------
include/linux/binfmts.h | 3 +
include/linux/coredump.h | 4 ++
include/linux/mm_types.h | 9 ++-
include/linux/ptrace.h | 1 +
include/linux/sched.h | 6 +-
include/linux/sched/coredump.h | 47 ++++------------
include/linux/sched/exec_state.h | 31 ++++++++++
init/init_task.c | 10 ++++
kernel/Makefile | 2 +-
kernel/cred.c | 3 +-
kernel/exec_state.c | 119 +++++++++++++++++++++++++++++++++++++++
kernel/exit.c | 1 -
kernel/fork.c | 33 +++++++++--
kernel/kthread.c | 1 -
kernel/ptrace.c | 51 +++++++++++------
kernel/sys.c | 6 +-
mm/init-mm.c | 1 -
23 files changed, 329 insertions(+), 155 deletions(-)
create mode 100644 include/linux/sched/exec_state.h
create mode 100644 kernel/exec_state.c