FS/XFS testing framework
* [PATCH] vfs: fix dumpable race in create_userns_hierarchy()
@ 2026-03-23 13:32 Christian Brauner
  2026-05-15 16:24 ` Zorro Lang
  0 siblings, 1 reply; 2+ messages in thread
From: Christian Brauner @ 2026-03-23 13:32 UTC
  To: fstests; +Cc: Christian Brauner

All processes in the userns hierarchy share the same mm_struct due to
CLONE_VM. When a child calls setresuid() in switch_ids(), the kernel
clears the dumpable flag on the shared mm. The child immediately
re-sets it via prctl(PR_SET_DUMPABLE, 1), but there is a window
between the two where the parent may be opening /proc/PID/ns/user for
the previous level's child. That open hits ptrace_may_access() which
checks get_dumpable(mm) and, finding it zero, falls back to
ptrace_has_cap(mm->user_ns, mode). Since mm->user_ns is init_user_ns
(the mm was created by real root) and the parent is only ns-root, the
capability check fails and the open returns -EACCES.

Fix this by extending the existing two-way socketpair handshake into a
three-way handshake: the child signals the parent after becoming
dumpable, then blocks until the parent has opened /proc/PID/ns/user
and signals back. Only then does the child proceed to
create_userns_hierarchy() for the next level. This guarantees no
concurrent setresuid() can clear dumpable while the parent's open() is
in flight.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
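Not for the commit log: a minimal, standalone sketch of the ordering
described above. It is an illustration under assumptions, not the fstests
code: it uses fork() and a plain socketpair instead of the CLONE_VM userns
hierarchy in src/vfs/utils.c, so it shows the three handshake steps but
cannot reproduce the race itself (the two processes do not share an mm).
run_handshake() and its "opened" out-parameter are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper, not part of fstests. Returns 0 if all three
 * handshake steps completed. *opened reports whether the parent's open()
 * of /proc/PID/ns/user succeeded (it may not where /proc is unavailable).
 */
int run_handshake(int *opened)
{
	int sv[2], status;
	char c;
	pid_t pid;

	assert(opened != NULL);
	*opened = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	if (pid == 0) {
		close(sv[0]);
		/* Step 1: become dumpable, then tell the parent we are ready. */
		if (prctl(PR_SET_DUMPABLE, 1) < 0 || write(sv[1], "1", 1) != 1)
			_exit(1);
		/* Step 3: block until the parent has finished its open(). */
		if (read(sv[1], &c, 1) != 1)
			_exit(1);
		/* Only here would the next level (and its setresuid()) start. */
		_exit(0);
	}

	close(sv[1]);

	/* Wait for the child's "ready" byte. */
	if (read(sv[0], &c, 1) == 1) {
		char path[64];
		int fd;

		/* Step 2: open the nsfd while the child is parked in read(). */
		snprintf(path, sizeof(path), "/proc/%d/ns/user", (int)pid);
		fd = open(path, O_RDONLY);
		if (fd >= 0) {
			*opened = 1;
			close(fd);
		}

		/* Release the child so it may proceed. */
		if (write(sv[0], "1", 1) != 1)
			kill(pid, SIGKILL);
	} else {
		kill(pid, SIGKILL);
	}
	close(sv[0]);

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling run_handshake(&opened) from a small main() is enough to exercise
it. The point of the extra round trip is that the child cannot reach the
code that would clear dumpable until the parent's open() has returned.
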
 src/vfs/utils.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/vfs/utils.c b/src/vfs/utils.c
index 52bb7e42..0b435afe 100644
--- a/src/vfs/utils.c
+++ b/src/vfs/utils.c
@@ -312,6 +312,18 @@ static int userns_fd_cb(void *data)
 	if (ret < 0)
 		return syserror("failure: write to socketpair");
 
+	/*
+	 * Wait for the parent to open our /proc/PID/ns/user before
+	 * proceeding to create the next userns level. Without this,
+	 * setresuid() in the next-level child clears the dumpable flag
+	 * on the shared mm (CLONE_VM) and the parent's open() fails the
+	 * ptrace_may_access() check with -EACCES because mm->user_ns is
+	 * init_user_ns where userns processes lack CAP_SYS_PTRACE.
+	 */
+	ret = read_nointr(h->fd_event, &c, 1);
+	if (ret < 0)
+		return syserror("failure: read from socketpair");
+
 	ret = create_userns_hierarchy(++h);
 	if (ret < 0)
 		return syserror("failure: userns level %d", h->level);
@@ -377,6 +389,14 @@ int create_userns_hierarchy(struct userns_hierarchy *h)
 		goto out_wait;
 	}
 
+	/* Tell the child it can now proceed to create the next level. */
+	bytes = write_nointr(fd_socket[0], "1", 1);
+	if (bytes < 0) {
+		kill(pid, SIGKILL);
+		syserror("failure: write to socketpair");
+		goto out_wait;
+	}
+
 	fret = 0;
 
 out_wait:

---
base-commit: 3ded3e13c008326d197d11ac975049ed1f8ec922
change-id: 20260323-fix-vfs-bugs-b4f8a492d6d7



* Re: [PATCH] vfs: fix dumpable race in create_userns_hierarchy()
  2026-03-23 13:32 [PATCH] vfs: fix dumpable race in create_userns_hierarchy() Christian Brauner
@ 2026-05-15 16:24 ` Zorro Lang
  0 siblings, 0 replies; 2+ messages in thread
From: Zorro Lang @ 2026-05-15 16:24 UTC
  To: Christian Brauner; +Cc: fstests

On Mon, Mar 23, 2026 at 02:32:15PM +0100, Christian Brauner wrote:
> All processes in the userns hierarchy share the same mm_struct due to
> CLONE_VM. When a child calls setresuid() in switch_ids(), the kernel
> clears the dumpable flag on the shared mm. The child immediately
> re-sets it via prctl(PR_SET_DUMPABLE, 1), but there is a window
> between the two where the parent may be opening /proc/PID/ns/user for
> the previous level's child. That open hits ptrace_may_access() which
> checks get_dumpable(mm) and, finding it zero, falls back to
> ptrace_has_cap(mm->user_ns, mode). Since mm->user_ns is init_user_ns
> (the mm was created by real root) and the parent is only ns-root, the
> capability check fails and the open returns -EACCES.
> 
> Fix this by extending the existing two-way socketpair handshake into a
> three-way handshake: the child signals the parent after becoming
> dumpable, then blocks until the parent has opened /proc/PID/ns/user
> and signals back. Only then does the child proceed to
> create_userns_hierarchy() for the next level. This guarantees no
> concurrent setresuid() can clear dumpable while the parent's open() is
> in flight.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---

Sorry, a few unforeseen things came up on my end and I lost track of this patch.

Reviewed-by: Zorro Lang <zlang@kernel.org>

>  src/vfs/utils.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/src/vfs/utils.c b/src/vfs/utils.c
> index 52bb7e42..0b435afe 100644
> --- a/src/vfs/utils.c
> +++ b/src/vfs/utils.c
> @@ -312,6 +312,18 @@ static int userns_fd_cb(void *data)
>  	if (ret < 0)
>  		return syserror("failure: write to socketpair");
>  
> +	/*
> +	 * Wait for the parent to open our /proc/PID/ns/user before
> +	 * proceeding to create the next userns level. Without this,
> +	 * setresuid() in the next-level child clears the dumpable flag
> +	 * on the shared mm (CLONE_VM) and the parent's open() fails the
> +	 * ptrace_may_access() check with -EACCES because mm->user_ns is
> +	 * init_user_ns where userns processes lack CAP_SYS_PTRACE.
> +	 */
> +	ret = read_nointr(h->fd_event, &c, 1);
> +	if (ret < 0)
> +		return syserror("failure: read from socketpair");
> +
>  	ret = create_userns_hierarchy(++h);
>  	if (ret < 0)
>  		return syserror("failure: userns level %d", h->level);
> @@ -377,6 +389,14 @@ int create_userns_hierarchy(struct userns_hierarchy *h)
>  		goto out_wait;
>  	}
>  
> +	/* Tell the child it can now proceed to create the next level. */
> +	bytes = write_nointr(fd_socket[0], "1", 1);
> +	if (bytes < 0) {
> +		kill(pid, SIGKILL);
> +		syserror("failure: write to socketpair");
> +		goto out_wait;
> +	}
> +
>  	fret = 0;
>  
>  out_wait:
> 
> ---
> base-commit: 3ded3e13c008326d197d11ac975049ed1f8ec922
> change-id: 20260323-fix-vfs-bugs-b4f8a492d6d7
> 
> 

