From: Christian Brauner <brauner@kernel.org>
To: Jann Horn <jannh@google.com>,
	 Linus Torvalds <torvalds@linuxfoundation.org>,
	 Oleg Nesterov <oleg@redhat.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Qualys Security Advisory <qsa@qualys.com>,
	Kees Cook <kees@kernel.org>,  Minchan Kim <minchan@kernel.org>,
	linux-mm@kvack.org,  Suren Baghdasaryan <surenb@google.com>,
	Lorenzo Stoakes <ljs@kernel.org>,
	 "Liam R. Howlett" <liam@infradead.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	 Mike Rapoport <rppt@kernel.org>, Michal Hocko <mhocko@suse.com>,
	 "Christian Brauner (Amutable)" <brauner@kernel.org>
Subject: [PATCH RFC v2 5/5] cred: switch dumpability lowering to task_exec_state
Date: Wed, 20 May 2026 16:42:58 +0200
Message-ID: <20260520-work-task_exec_state-v2-5-9ea88ceb09e6@kernel.org>
In-Reply-To: <20260520-work-task_exec_state-v2-0-9ea88ceb09e6@kernel.org>

commit_creds() has historically called set_dumpable(suid_dumpable) on
every effective uid/gid/cap change, paired with an smp_wmb()/smp_rmb()
fence against __ptrace_may_access() reading the credentials.

Switch the call to task_exec_state_set_dumpable() so the dumpability
lowering targets the new per-task exec_state rather than mm->flags.
Drop the open-coded "if (task->mm)" guard - exec_state is always
allocated for any observable task - and drop the explicit
smp_wmb()/smp_rmb() pair: the new model relies on RCU acquire/release
on the cred pointer.  WRITE_ONCE() on es->dumpable inside
task_exec_state_set_dumpable() happens-before rcu_assign_pointer() of
the new cred in commit_creds(), so a reader that observes the new
cred via rcu_dereference(task->real_cred) in __ptrace_may_access() is
guaranteed to observe the new dumpable via READ_ONCE(es->dumpable).
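
The ordering argument above can be sketched in userspace with C11
atomics.  This is only a model, not kernel code: struct model_cred,
struct model_task, model_commit_creds() and model_may_access() are
hypothetical names, and the reader uses memory_order_acquire
explicitly.  The sketch makes no claim about which ordering
rcu_dereference() itself provides for a load that does not depend on
the returned pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_cred {
	unsigned int euid;
};

struct model_task {
	_Atomic int dumpable;                    /* models es->dumpable */
	_Atomic(const struct model_cred *) cred; /* models task->real_cred */
};

/*
 * Writer side: lower dumpability first, then publish the new cred with
 * release semantics, so the dumpable store is ordered before the
 * pointer store.
 */
void model_commit_creds(struct model_task *t, const struct model_cred *new,
			int suid_dumpable)
{
	atomic_store_explicit(&t->dumpable, suid_dumpable,
			      memory_order_relaxed);
	atomic_store_explicit(&t->cred, new, memory_order_release);
}

/*
 * Reader side: load the cred pointer with acquire semantics, then read
 * dumpable.  A reader that sees the new cred therefore also sees the
 * lowered dumpable value.  Returns 1 if access would be allowed.
 */
int model_may_access(struct model_task *t, unsigned int tracer_uid)
{
	const struct model_cred *c =
		atomic_load_explicit(&t->cred, memory_order_acquire);
	int dumpable =
		atomic_load_explicit(&t->dumpable, memory_order_relaxed);

	return c->euid == tracer_uid && dumpable == 1;
}
```

In this model the bad outcome the old barrier pair guarded against,
a reader seeing the new cred together with the old dumpable value, is
excluded by the release/acquire pairing on the cred pointer.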

The same-uid ptrace shedding and /proc visibility behavior that
long-running daemons launched as root (sshd, dbus-daemon, polkitd,
NetworkManager, postfix workers, ...) rely on when they setresuid()
to a service uid is preserved.  No userspace audit cycle is required.

Behavioral change: dumpability propagates across the fork subtree
=================================================================

exec_state is refcount-shared across every clone() variant - thread,
fork(), vfork(), io_uring worker - so this write is observed by every
task still sharing the same exec_state.  Before this series,
set_dumpable() targeted mm->flags, which is per-mm: shared by CLONE_VM
threads but private to children created by fork() without CLONE_VM.

Under the new model a privilege drop in any task in the subtree
lowers dumpability for the entire subtree, including non-CLONE_VM
siblings.  This matches the model the series codifies: the entire
fork subtree of one execve shares one exec_state, and dumpability is
a property of that domain.
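
The sharing semantics described above can be illustrated with a small
single-threaded userspace model.  All names (struct sketch_exec_state,
struct sketch_task, sketch_execve(), sketch_fork(),
sketch_set_dumpable(), sketch_exit()) are hypothetical, the refcount
is a plain int rather than the kernel's refcount_t, and the assumption
that execve() installs a fresh exec_state is taken from the
description in this series, not from the actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the shared per-execve state. */
struct sketch_exec_state {
	int refcount;
	int dumpable;   /* 1 models "dumpable", 0 models "not dumpable" */
};

struct sketch_task {
	struct sketch_exec_state *es;
};

/* Drop one reference; free the state when the last sharer is gone. */
static void sketch_put(struct sketch_exec_state *es)
{
	if (es && --es->refcount == 0)
		free(es);
}

/* Assumed: execve() starts a new domain with its own exec_state. */
int sketch_execve(struct sketch_task *t)
{
	struct sketch_exec_state *es = malloc(sizeof(*es));

	if (!es)
		return -1;
	es->refcount = 1;
	es->dumpable = 1;
	sketch_put(t->es);
	t->es = es;
	return 0;
}

/* Any clone() variant shares the parent's exec_state. */
void sketch_fork(const struct sketch_task *parent, struct sketch_task *child)
{
	child->es = parent->es;
	child->es->refcount++;
}

/* The write lands in the shared state, so every sharer observes it. */
void sketch_set_dumpable(struct sketch_task *t, int value)
{
	t->es->dumpable = value;
}

void sketch_exit(struct sketch_task *t)
{
	sketch_put(t->es);
	t->es = NULL;
}
```

In this model a privilege drop in a forked child lowers dumpable for
the parent as well, while a task that went through its own execve is
unaffected.  With the pre-series per-mm flag the forked child would
have had a private copy.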

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 kernel/cred.c   | 25 ++++++++++++-------------
 kernel/ptrace.c | 10 ----------
 2 files changed, 12 insertions(+), 23 deletions(-)

diff --git a/kernel/cred.c b/kernel/cred.c
index 51c35ac94787..335d8da1c43b 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -378,25 +378,24 @@ int commit_creds(struct cred *new)
 
 	get_cred(new); /* we will require a ref for the subj creds too */
 
-	/* dumpability changes */
+	/*
+	 * Lower dumpability on euid/egid/fsuid/fsgid/capability changes.
+	 * Long-running daemons launched as root (sshd, dbus-daemon,
+	 * polkitd, NetworkManager, postfix workers, ...) rely on this to
+	 * shed /proc visibility and same-uid ptrace exposure of
+	 * root-acquired secrets when they setresuid() to a service uid.
+	 *
+	 * exec_state is shared across the whole fork subtree of the
+	 * establishing execve(), so this write is observed by every task
+	 * still sharing the same exec_state.
+	 */
 	if (!uid_eq(old->euid, new->euid) ||
 	    !gid_eq(old->egid, new->egid) ||
 	    !uid_eq(old->fsuid, new->fsuid) ||
 	    !gid_eq(old->fsgid, new->fsgid) ||
 	    !cred_cap_issubset(old, new)) {
-		if (task->mm)
-			task_exec_state_set_dumpable(suid_dumpable);
+		task_exec_state_set_dumpable(suid_dumpable);
 		task->pdeath_signal = 0;
-		/*
-		 * If a task drops privileges and becomes nondumpable,
-		 * the dumpability change must become visible before
-		 * the credential change; otherwise, a __ptrace_may_access()
-		 * racing with this change may be able to attach to a task it
-		 * shouldn't be able to attach to (as if the task had dropped
-		 * privileges without becoming nondumpable).
-		 * Pairs with a read barrier in __ptrace_may_access().
-		 */
-		smp_wmb();
 	}
 
 	/* alter the thread keyring */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index a4932ef716c6..c340a741e76a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -356,16 +356,6 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	return -EPERM;
 ok:
 	rcu_read_unlock();
-	/*
-	 * If a task drops privileges and becomes nondumpable (through a syscall
-	 * like setresuid()) while we are trying to access it, we must ensure
-	 * that the dumpability is read after the credentials; otherwise,
-	 * we may be able to attach to a task that we shouldn't be able to
-	 * attach to (as if the task had dropped privileges without becoming
-	 * nondumpable).
-	 * Pairs with a write barrier in commit_creds().
-	 */
-	smp_rmb();
 	if (!task_still_dumpable(task, mode))
 		return -EPERM;
 

-- 
2.47.3


