From: Christian Brauner <brauner@kernel.org>
To: Jann Horn <jannh@google.com>,
Linus Torvalds <torvalds@linuxfoundation.org>,
Oleg Nesterov <oleg@redhat.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Qualys Security Advisory <qsa@qualys.com>,
Kees Cook <kees@kernel.org>, Minchan Kim <minchan@kernel.org>,
linux-mm@kvack.org, Suren Baghdasaryan <surenb@google.com>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <liam@infradead.org>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>, Michal Hocko <mhocko@suse.com>,
"Christian Brauner (Amutable)" <brauner@kernel.org>
Subject: [PATCH RFC v2 5/5] cred: switch dumpability lowering to task_exec_state
Date: Wed, 20 May 2026 16:42:58 +0200
Message-ID: <20260520-work-task_exec_state-v2-5-9ea88ceb09e6@kernel.org>
In-Reply-To: <20260520-work-task_exec_state-v2-0-9ea88ceb09e6@kernel.org>

commit_creds() has historically called set_dumpable(suid_dumpable) on
every euid/egid/fsuid/fsgid/capability change, paired with an
smp_wmb()/smp_rmb() fence against __ptrace_may_access() reading the
credentials.

Switch the call to task_exec_state_set_dumpable() so the dumpability
lowering targets the new per-task exec_state rather than mm->flags.

Drop the open-coded "if (task->mm)" guard - exec_state is always
allocated for any observable task - and drop the explicit
smp_wmb()/smp_rmb() pair: the new model relies on RCU acquire/release
on the cred pointer. WRITE_ONCE() on es->dumpable inside
task_exec_state_set_dumpable() happens-before rcu_assign_pointer() of
the new cred in commit_creds(), so a reader that observes the new
cred via rcu_dereference(task->real_cred) in __ptrace_may_access() is
guaranteed to observe the new dumpable via READ_ONCE(es->dumpable).

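The ordering argument above can be sketched in userspace with C11 atomics.
This is only an illustrative model, not kernel code: struct cred_model,
struct exec_state_model, struct task_model, model_commit_creds() and
model_may_access() are hypothetical names. It also assumes the reader side
behaves like an acquire load; whether rcu_dereference() gives the same
guarantee for a load that does not depend on the cred pointer is not
settled by this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct cred_model {
	unsigned int euid;
};

struct exec_state_model {
	atomic_int dumpable;	/* 1 = dumpable, 0 = not dumpable */
};

struct task_model {
	struct exec_state_model *es;
	_Atomic(struct cred_model *) real_cred;
};

/*
 * Writer: lower dumpability first, then publish the new cred with
 * release semantics so the dumpable store cannot be reordered after it.
 */
static void model_commit_creds(struct task_model *t,
			       struct cred_model *new_cred,
			       int suid_dumpable)
{
	atomic_store_explicit(&t->es->dumpable, suid_dumpable,
			      memory_order_relaxed);
	atomic_store_explicit(&t->real_cred, new_cred,
			      memory_order_release);
}

/*
 * Reader: load the cred with acquire semantics, then read dumpable.
 * If the new cred is observed, the lowered dumpable is observed too.
 * Returns 1 if access would be allowed in this toy model, else 0.
 */
static int model_may_access(struct task_model *t, unsigned int caller_uid)
{
	struct cred_model *c =
		atomic_load_explicit(&t->real_cred, memory_order_acquire);

	assert(c != NULL);
	if (c->euid != caller_uid)
		return 0;
	return atomic_load_explicit(&t->es->dumpable,
				    memory_order_relaxed) != 0;
}
```

In this model a caller whose uid matches the newly published cred can never
see the old dumpable value, which is the property the removed
smp_wmb()/smp_rmb() pair used to provide.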
The same-uid ptrace shedding and /proc visibility behavior that
long-running daemons launched as root (sshd, dbus-daemon, polkitd,
NetworkManager, postfix workers, ...) rely on when they setresuid()
to a service uid is preserved. No userspace audit cycle is required.

Behavioral change: dumpability propagates across the fork subtree
=================================================================

exec_state is refcount-shared across every clone() variant - thread,
fork(), vfork(), io_uring worker - so this write is observed by every
task still sharing the same exec_state. Pre-series, set_dumpable()
targeted mm->flags, which was per-mm: shared by CLONE_VM threads but
private to fork()-without-CLONE_VM children.

Under the new model a privilege drop in any task in the subtree
lowers dumpability for the entire subtree, including non-CLONE_VM
siblings. This matches the model the series codifies: the entire
fork subtree of one execve shares one exec_state, and dumpability is
a property of that domain.

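The sharing model described above can be illustrated with a small userspace
sketch. All names here (struct exec_state_model, struct task_model,
es_alloc(), es_put(), model_clone(), model_execve(), model_set_dumpable())
are hypothetical. The plain-int refcount and the assumption that execve()
installs a fresh exec_state are simplifications for illustration, not the
series' actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a refcount-shared exec_state (assumption). */
struct exec_state_model {
	int refcount;	/* plain int: single-threaded sketch only */
	int dumpable;	/* 1 = dumpable, 0 = not dumpable */
};

struct task_model {
	struct exec_state_model *es;
};

static struct exec_state_model *es_alloc(int dumpable)
{
	struct exec_state_model *es = malloc(sizeof(*es));

	if (!es)
		return NULL;
	es->refcount = 1;
	es->dumpable = dumpable;
	return es;
}

static void es_put(struct exec_state_model *es)
{
	if (--es->refcount == 0)
		free(es);
}

/* Any clone() variant: the child shares the parent's exec_state. */
static void model_clone(struct task_model *child,
			const struct task_model *parent)
{
	child->es = parent->es;
	child->es->refcount++;
}

/* execve(): the task leaves the old domain and starts a new one. */
static int model_execve(struct task_model *t, int dumpable)
{
	struct exec_state_model *fresh = es_alloc(dumpable);

	if (!fresh)
		return -1;
	es_put(t->es);
	t->es = fresh;
	return 0;
}

/* A privilege drop in one task writes the shared state. */
static void model_set_dumpable(struct task_model *t, int value)
{
	t->es->dumpable = value;
}
```

In this model a child that lowers dumpability also lowers it for its parent
and siblings, while a task that has since gone through model_execve() is
unaffected because it holds a different exec_state.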
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
kernel/cred.c | 25 ++++++++++++-------------
kernel/ptrace.c | 10 ----------
2 files changed, 12 insertions(+), 23 deletions(-)
diff --git a/kernel/cred.c b/kernel/cred.c
index 51c35ac94787..335d8da1c43b 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -378,25 +378,24 @@ int commit_creds(struct cred *new)
get_cred(new); /* we will require a ref for the subj creds too */
- /* dumpability changes */
+ /*
+ * Lower dumpability on euid/egid/fsuid/fsgid/capability changes.
+ * Long-running daemons launched as root (sshd, dbus-daemon,
+ * polkitd, NetworkManager, postfix workers, ...) rely on this to
+ * shed /proc visibility and same-uid ptrace exposure of
+ * root-acquired secrets when they setresuid() to a service uid.
+ *
+ * exec_state is shared across the whole fork subtree of the
+ * establishing execve(), so this write is observed by every task
+ * still sharing the same exec_state.
+ */
if (!uid_eq(old->euid, new->euid) ||
!gid_eq(old->egid, new->egid) ||
!uid_eq(old->fsuid, new->fsuid) ||
!gid_eq(old->fsgid, new->fsgid) ||
!cred_cap_issubset(old, new)) {
- if (task->mm)
- task_exec_state_set_dumpable(suid_dumpable);
+ task_exec_state_set_dumpable(suid_dumpable);
task->pdeath_signal = 0;
- /*
- * If a task drops privileges and becomes nondumpable,
- * the dumpability change must become visible before
- * the credential change; otherwise, a __ptrace_may_access()
- * racing with this change may be able to attach to a task it
- * shouldn't be able to attach to (as if the task had dropped
- * privileges without becoming nondumpable).
- * Pairs with a read barrier in __ptrace_may_access().
- */
- smp_wmb();
}
/* alter the thread keyring */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index a4932ef716c6..c340a741e76a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -356,16 +356,6 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
return -EPERM;
ok:
rcu_read_unlock();
- /*
- * If a task drops privileges and becomes nondumpable (through a syscall
- * like setresuid()) while we are trying to access it, we must ensure
- * that the dumpability is read after the credentials; otherwise,
- * we may be able to attach to a task that we shouldn't be able to
- * attach to (as if the task had dropped privileges without becoming
- * nondumpable).
- * Pairs with a write barrier in commit_creds().
- */
- smp_rmb();
if (!task_still_dumpable(task, mode))
return -EPERM;
--
2.47.3