From: "Christian Brauner (Amutable)" <brauner@kernel.org>
To: Jann Horn <jannh@google.com>,
Linus Torvalds <torvalds@linuxfoundation.org>,
Oleg Nesterov <oleg@redhat.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Qualys Security Advisory <qsa@qualys.com>,
Kees Cook <kees@kernel.org>, Minchan Kim <minchan@kernel.org>,
linux-mm@kvack.org, Suren Baghdasaryan <surenb@google.com>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <liam@infradead.org>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>, Michal Hocko <mhocko@suse.com>,
"Christian Brauner (Amutable)" <brauner@kernel.org>
Subject: [PATCH RFC v3 0/4] exec: introduce task_exec_state for exec-time metadata
Date: Wed, 20 May 2026 23:48:51 +0200
Message-ID: <20260520-work-task_exec_state-v3-0-69f895bc1385@kernel.org>

This series relocates the dumpable mode and the user_namespace
captured at execve() from mm_struct onto a new per-task
task_exec_state structure that stays attached to the task for its
full lifetime.

__ptrace_may_access() and several /proc owner / visibility checks
need to consult two pieces of state for any observable task,
including zombies that have already gone through exit_mm(): the
dumpable mode and the user namespace captured at execve(). Both
live on mm_struct today, which exit_mm() clears from the task long
before the task is reaped.

A reader that races with do_exit() observes task->mm == NULL and
either fails the check or falls back to init_user_ns, which denies
legitimate access to non-dumpable zombies that were running in a
nested user namespace.
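
As a rough illustration of the race described above, here is a userspace
sketch (not code from this series). All type and function names are
hypothetical stand-ins: owner_ns_via_mm() models a lookup that depends on
task->mm, owner_ns_via_exec_state() models a lookup through state that
outlives exit_mm(), and model_exit_mm() models only the clearing of
task->mm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct user_ns { int level; };
struct mm { struct user_ns *user_ns; int dumpable; };
struct exec_state { struct user_ns *user_ns; int dumpable; };
struct task { struct mm *mm; struct exec_state *es; };

/* Stand-in for init_user_ns. */
struct user_ns init_ns = { 0 };

/* Lookup that depends on task->mm: falls back once the mm is gone. */
struct user_ns *owner_ns_via_mm(const struct task *t)
{
	return t->mm ? t->mm->user_ns : &init_ns;
}

/* Lookup through state that stays attached until the task is reaped. */
struct user_ns *owner_ns_via_exec_state(const struct task *t)
{
	return t->es->user_ns;
}

/* Models the single effect of exit_mm() that matters here. */
void model_exit_mm(struct task *t)
{
	t->mm = NULL;
}
```

In this model, a zombie that ran in a nested namespace is judged against
init_ns by the first helper after model_exit_mm(), while the second helper
still returns the nested namespace.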

mm_struct loses ->user_ns and the dumpability bits in ->flags.
MMF_DUMPABLE_BITS is reserved so the MMF_DUMP_FILTER_* layout exposed
via /proc/<pid>/coredump_filter stays stable. task->user_dumpable and its
exit_mm() snapshot are removed.
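
To make the layout point concrete, the following sketch shows why keeping
the low bits reserved leaves the filter bits where they were. The macro
values (2 reserved bits, 9 filter bits) are assumptions modeled on current
mainline headers and may not match what the series ends up with; the two
helper functions are hypothetical.

```c
#include <assert.h>

/* Assumed values, modeled on mainline headers; not taken from the series. */
#define MMF_DUMPABLE_BITS	2	/* low bits, kept reserved */
#define MMF_DUMP_FILTER_SHIFT	MMF_DUMPABLE_BITS
#define MMF_DUMP_FILTER_BITS	9
#define MMF_DUMP_FILTER_MASK \
	(((1UL << MMF_DUMP_FILTER_BITS) - 1) << MMF_DUMP_FILTER_SHIFT)

/* Hypothetical helper: extract the filter value from the flags word. */
unsigned long coredump_filter_get(unsigned long mm_flags)
{
	return (mm_flags & MMF_DUMP_FILTER_MASK) >> MMF_DUMP_FILTER_SHIFT;
}

/* Hypothetical helper: store a filter value, leaving other bits alone. */
unsigned long coredump_filter_set(unsigned long mm_flags, unsigned long filter)
{
	mm_flags &= ~MMF_DUMP_FILTER_MASK;
	return mm_flags | ((filter << MMF_DUMP_FILTER_SHIFT) & MMF_DUMP_FILTER_MASK);
}
```

Because the shift is derived from MMF_DUMPABLE_BITS, dropping that
reservation would move every filter bit down by two positions.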

task_exec_state is the privilege domain established by an execve()
[1]. Within a thread group it is shared via refcount; across thread
groups each task has its own:
- CLONE_VM siblings (thread-group members, io_uring workers)
refcount-share the parent's exec_state.
- Non-CLONE_VM clones (fork(), clone() with CLONE_VFORK but no CLONE_VM)
allocate a fresh exec_state inheriting the parent's dumpable
mode and user_ns.
- execve() in the child allocates a fresh instance and installs
it under task_lock + exec_update_lock via
task_exec_state_replace().
- Credential changes (setresuid, capset, ...) and
prctl(PR_SET_DUMPABLE) update dumpability on the current
task's exec_state, i.e. on the thread group's shared instance.
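
The sharing rules listed above can be sketched as a small userspace model.
This is an illustration under stated assumptions, not the series'
implementation: every name below is hypothetical, the refcount is a plain
int rather than an atomic, and the task_lock / exec_update_lock
serialization is omitted entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; not the series' API. */
struct exec_state { int refcount; int dumpable; int user_ns_id; };
struct task { struct exec_state *es; };

struct exec_state *es_alloc(int dumpable, int user_ns_id)
{
	struct exec_state *es = malloc(sizeof(*es));

	if (!es)
		abort();
	es->refcount = 1;
	es->dumpable = dumpable;
	es->user_ns_id = user_ns_id;
	return es;
}

void es_put(struct exec_state *es)
{
	if (--es->refcount == 0)
		free(es);
}

/* clone(): share with CLONE_VM, otherwise allocate and inherit. */
void model_clone(struct task *child, const struct task *parent, int clone_vm)
{
	if (clone_vm) {
		parent->es->refcount++;
		child->es = parent->es;
	} else {
		child->es = es_alloc(parent->es->dumpable,
				     parent->es->user_ns_id);
	}
}

/* execve(): install a fresh instance and drop the old reference. */
void model_exec(struct task *t, int dumpable, int user_ns_id)
{
	struct exec_state *old = t->es;

	t->es = es_alloc(dumpable, user_ns_id);
	es_put(old);
}

/* prctl(PR_SET_DUMPABLE): acts on whatever instance the task points at. */
void model_set_dumpable(struct task *t, int dumpable)
{
	t->es->dumpable = dumpable;
}
```

In this model a dumpability change made by a forked child stays in the
child's own instance, while a change made by a CLONE_VM sibling is visible
to the whole group.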

Behavioral change:

Kernel threads that briefly use a user mm via kthread_use_mm() no
longer inherit dumpability from the borrowed mm. Kthreads are not
ptraceable (PF_KTHREAD short-circuits __ptrace_may_access), so this
is observable only via /proc surfaces that a sufficiently privileged
reader can reach.

[1] https://lore.kernel.org/r/CAHk-=wj+NgoDH3GSicJ140SV8OoDd71pLmL3fgFEsTcgoMC6Og@mail.gmail.com

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
Changes in v3:
- Restore alloc-fresh-and-inherit semantics for non-CLONE_VM clones.
CLONE_VM siblings still refcount-share; fork() and other
non-CLONE_VM clones get a fresh exec_state that inherits the
parent's dumpable mode and user_ns. The v2 "every clone
refcount-shares" model would have let any forked process in an
Android zygote64 subtree influence dumpability of its siblings
via prctl(PR_SET_DUMPABLE).
- Link to v2: https://patch.msgid.link/20260520-work-task_exec_state-v2-0-9ea88ceb09e6@kernel.org
Changes in v2:
- Drop dup-on-fork for non-CLONE_VM clones: every clone() variant
refcount-shares the parent's task_exec_state; only execve()
allocates a fresh one. See "Behavioral changes" in the cover
letter for the implications.
- Switch commit_creds() to update dumpability on the new
task_exec_state (instead of dropping the set_dumpable() call
entirely as in v1). Drops the explicit smp_wmb()/smp_rmb() pair;
RCU acquire/release on the cred pointer provides the ordering.
- Link to v1: https://patch.msgid.link/20260516-work-exit_mm-v1-1-76bcc7c2439d@kernel.org
---
Christian Brauner (Amutable) (4):
      sched/coredump: introduce enum task_dumpable
      exec: introduce struct task_exec_state
      ptrace: add ptracer_access_allowed()
      exec_state: relocate dumpable information

 arch/arm64/kernel/mte.c          |   6 +-
 drivers/firmware/efi/efi.c       |   1 -
 fs/coredump.c                    |  22 +++-----
 fs/exec.c                        |  39 ++++++-------
 fs/pidfs.c                       |  23 +++-----
 fs/proc/base.c                   |  39 ++++++-------
 include/linux/binfmts.h          |   2 +
 include/linux/coredump.h         |   4 ++
 include/linux/mm_types.h         |   9 ++-
 include/linux/ptrace.h           |   1 +
 include/linux/sched.h            |   6 +-
 include/linux/sched/coredump.h   |  47 ++++------------
 include/linux/sched/exec_state.h |  29 ++++++++++
 init/init_task.c                 |  10 ++++
 kernel/Makefile                  |   2 +-
 kernel/cred.c                    |   3 +-
 kernel/exec_state.c              | 116 +++++++++++++++++++++++++++++++++++++++
 kernel/exit.c                    |   1 -
 kernel/fork.c                    |  32 +++++++++--
 kernel/kthread.c                 |   1 -
 kernel/ptrace.c                  |  53 ++++++++++++------
 kernel/sys.c                     |   6 +-
 mm/init-mm.c                     |   1 -
 23 files changed, 301 insertions(+), 152 deletions(-)
---
base-commit: ab5fce87a778cb780a05984a2ca448f2b41aafbf
change-id: 20260520-work-task_exec_state-83209d8b3e53

Thread overview: 16+ messages
2026-05-20 21:48 Christian Brauner (Amutable) [this message]
2026-05-20 21:48 ` [PATCH RFC v3 1/4] sched/coredump: introduce enum task_dumpable Christian Brauner (Amutable)
2026-05-22 22:14 ` David Hildenbrand (Arm)
2026-05-20 21:48 ` [PATCH RFC v3 2/4] exec: introduce struct task_exec_state Christian Brauner (Amutable)
2026-05-22 15:00 ` Oleg Nesterov
2026-05-26 7:16 ` Christian Brauner
2026-05-26 8:17 ` Oleg Nesterov
2026-05-22 22:21 ` David Hildenbrand (Arm)
2026-05-20 21:48 ` [PATCH RFC v3 3/4] ptrace: add ptracer_access_allowed() Christian Brauner (Amutable)
2026-05-22 15:08 ` Oleg Nesterov
2026-05-22 22:32 ` David Hildenbrand (Arm)
2026-05-20 21:48 ` [PATCH RFC v3 4/4] exec_state: relocate dumpable information Christian Brauner (Amutable)
2026-05-21 10:05 ` Christian Brauner
2026-05-21 11:16 ` Jann Horn
2026-05-21 13:08 ` Christian Brauner
2026-05-26 13:07 ` David Hildenbrand (Arm)