From: Christian Brauner <brauner@kernel.org>
To: fstests@vger.kernel.org
Cc: Christian Brauner <brauner@kernel.org>
Subject: [PATCH] vfs: fix dumpable race in create_userns_hierarchy()
Date: Mon, 23 Mar 2026 14:32:15 +0100
Message-ID: <20260323-fix-vfs-bugs-v1-1-76a6aed624d9@kernel.org>

All processes in the userns hierarchy share the same mm_struct due to
CLONE_VM. When a child calls setresuid() in switch_ids(), the kernel
clears the dumpable flag on the shared mm. The child immediately
re-sets it via prctl(PR_SET_DUMPABLE, 1), but there is a window
between the two where the parent may be opening /proc/PID/ns/user for
the previous level's child. That open hits ptrace_may_access(), which
checks get_dumpable(mm) and, finding it zero, falls back to
ptrace_has_cap(mm->user_ns, mode). Since mm->user_ns is init_user_ns
(the mm was created by real root) and the parent is only ns-root, the
capability check fails and the open returns -EACCES.

Fix this by extending the existing two-way socketpair handshake into a
three-way handshake: the child signals the parent after becoming
dumpable, then blocks until the parent has opened /proc/PID/ns/user
and signals back. Only then does the child proceed to
create_userns_hierarchy() for the next level. This guarantees no
concurrent setresuid() can clear dumpable while the parent's open() is
in flight.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
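Note for reviewers, not part of the commit message: the ordering
described above can be sketched as a standalone program. This is a
minimal illustration under stated assumptions, not the fstests code. It
uses fork() rather than clone(CLONE_VM), so no mm is shared and the race
itself cannot occur here. read_retry(), write_retry(), child_side() and
handshake_demo() are hypothetical names, the first two standing in for
read_nointr() and write_nointr(). The parent's open() of
/proc/PID/ns/user is reduced to a comment.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for read_nointr(): retry read() on EINTR. */
static ssize_t read_retry(int fd, void *buf, size_t n)
{
	ssize_t ret;

	do {
		ret = read(fd, buf, n);
	} while (ret < 0 && errno == EINTR);
	return ret;
}

/* Hypothetical stand-in for write_nointr(): retry write() on EINTR. */
static ssize_t write_retry(int fd, const void *buf, size_t n)
{
	ssize_t ret;

	do {
		ret = write(fd, buf, n);
	} while (ret < 0 && errno == EINTR);
	return ret;
}

/* Child side of the handshake; the return value is the exit status. */
static int child_side(int fd)
{
	char c;

	/*
	 * Step 1: the real child would have called setresuid() and
	 * prctl(PR_SET_DUMPABLE, 1) by now. Tell the parent we are ready.
	 */
	if (write_retry(fd, "1", 1) != 1)
		return 1;

	/* Step 3: block until the parent reports that its open() is done. */
	if (read_retry(fd, &c, 1) != 1)
		return 1;

	/* Only at this point would the next userns level be created. */
	return 0;
}

/* Returns 0 if parent and child completed all three steps in order. */
int handshake_demo(void)
{
	int fd_socket[2];
	int fret = -1;
	int status;
	pid_t pid, waited;
	char c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd_socket) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fd_socket[0]);
		close(fd_socket[1]);
		return -1;
	}
	if (pid == 0) {
		close(fd_socket[0]);
		_exit(child_side(fd_socket[1]));
	}
	close(fd_socket[1]);

	/* Step 1: wait until the child says it is dumpable again. */
	if (read_retry(fd_socket[0], &c, 1) != 1) {
		kill(pid, SIGKILL);
		goto out_wait;
	}

	/* Step 2: placeholder for opening /proc/PID/ns/user of the child. */

	/* Step 3: release the child so it may create the next level. */
	if (write_retry(fd_socket[0], "1", 1) != 1) {
		kill(pid, SIGKILL);
		goto out_wait;
	}
	fret = 0;

out_wait:
	close(fd_socket[0]);
	do {
		waited = waitpid(pid, &status, 0);
	} while (waited < 0 && errno == EINTR);
	if (waited != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		fret = -1;
	return fret;
}
```

Calling handshake_demo() from a main() returns 0 only when the child
was released by the parent's final write, which is the property the
patch relies on: the child cannot move on while step 2 is in progress.
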
 src/vfs/utils.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/vfs/utils.c b/src/vfs/utils.c
index 52bb7e42..0b435afe 100644
--- a/src/vfs/utils.c
+++ b/src/vfs/utils.c
@@ -312,6 +312,18 @@ static int userns_fd_cb(void *data)
 	if (ret < 0)
 		return syserror("failure: write to socketpair");
 
+	/*
+	 * Wait for the parent to open our /proc/PID/ns/user before
+	 * proceeding to create the next userns level. Without this,
+	 * setresuid() in the next-level child clears the dumpable flag
+	 * on the shared mm (CLONE_VM) and the parent's open() fails the
+	 * ptrace_may_access() check with -EACCES because mm->user_ns is
+	 * init_user_ns, where userns processes lack CAP_SYS_PTRACE.
+	 */
+	ret = read_nointr(h->fd_event, &c, 1);
+	if (ret < 0)
+		return syserror("failure: read from socketpair");
+
 	ret = create_userns_hierarchy(++h);
 	if (ret < 0)
 		return syserror("failure: userns level %d", h->level);
@@ -377,6 +389,14 @@ int create_userns_hierarchy(struct userns_hierarchy *h)
 		goto out_wait;
 	}
 
+	/* Tell the child it can now proceed to create the next level. */
+	bytes = write_nointr(fd_socket[0], "1", 1);
+	if (bytes < 0) {
+		kill(pid, SIGKILL);
+		syserror("failure: write to socketpair");
+		goto out_wait;
+	}
+
 	fret = 0;
 
 out_wait:

---
base-commit: 3ded3e13c008326d197d11ac975049ed1f8ec922
change-id: 20260323-fix-vfs-bugs-b4f8a492d6d7

