From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Moore <paul@paul-moore.com>, Eric Paris <eparis@redhat.com>,
Christian Brauner <brauner@kernel.org>,
linux-kernel@vger.kernel.org, audit@vger.kernel.org,
Richard Guy Briggs <rgb@redhat.com>,
Ricardo Robaina <rrobaina@redhat.com>,
Waiman Long <llong@redhat.com>
Subject: [PATCH][RFC] bug in unshare(2) failure recovery
Date: Sat, 7 Feb 2026 08:25:24 +0000 [thread overview]
Message-ID: <20260207082524.GE3183987@ZenIV> (raw)
In-Reply-To: <5cb07c57-9dca-4086-af88-f866f765c7fb@redhat.com>
On Fri, Feb 06, 2026 at 03:04:53PM -0500, Waiman Long wrote:
[summary of subthread: there's an unpleasant corner case in unshare(2),
when we have a CLONE_NEWNS in flags and current->fs hadn't been shared
at all; in that case copy_mnt_ns() gets passed current->fs instead of
a private copy, which causes interesting warts in proof of correctness]
> I guess if private means fs->users == 1, the condition could still be true.
Unfortunately, it's worse than just a convoluted proof of correctness.
Consider the case when we have CLONE_NEWCGROUP in addition to CLONE_NEWNS
(and current->fs->users == 1).
We pass current->fs to copy_mnt_ns(), all right. Suppose it succeeds and
flips current->fs->{pwd,root} to corresponding locations in the new namespace.
Now we proceed to copy_cgroup_ns(), which fails (e.g. with -ENOMEM).
We call put_mnt_ns() on the namespace created by copy_mnt_ns(), it's
destroyed and its mount tree is dissolved, but... current->fs->root and
current->fs->pwd are both left pointing to now detached mounts.
They are pinning those, so it's not a UAF, but it leaves the calling
process with unshare(2) failing with -ENOMEM _and_ leaving it with
pwd and root on detached isolated mounts. The last part is clearly a bug.
There is other fun related to that mess (races with pivot_root(), including
the one between pivot_root() and fork(), of all things), but this one
is easy to isolate and fix - treat CLONE_NEWNS as "allocate a new
fs_struct even if it hadn't been shared in the first place". Sure, we could
go for something like "if both CLONE_NEWNS *and* one of the things that might
end up failing after copy_mnt_ns() call in create_new_namespaces() are set,
force allocation of new fs_struct", but let's keep it simple - the cost
of copy_fs_struct() is trivial.
Another benefit is that copy_mnt_ns() with CLONE_NEWNS *always* gets
a freshly allocated fs_struct, yet to be attached to anything. That
seriously simplifies the analysis...
FWIW, that bug had been there since the introduction of unshare(2) ;-/
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/kernel/fork.c b/kernel/fork.c
index b1f3915d5f8e..68ccbaea7398 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -3082,7 +3082,7 @@ static int unshare_fs(unsigned long unshare_flags, struct fs_struct **new_fsp)
return 0;
/* don't need lock here; in the worst case we'll do useless copy */
- if (fs->users == 1)
+ if (!(unshare_flags & CLONE_NEWNS) && fs->users == 1)
return 0;
*new_fsp = copy_fs_struct(fs);
next prev parent reply other threads:[~2026-02-07 8:23 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-03 19:44 [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths Waiman Long
2026-02-03 19:59 ` Al Viro
2026-02-03 20:18 ` Waiman Long
2026-02-03 20:05 ` Al Viro
2026-02-03 20:32 ` Waiman Long
2026-02-03 21:50 ` Al Viro
2026-02-03 23:26 ` Al Viro
2026-02-04 4:21 ` Waiman Long
2026-02-04 6:26 ` Al Viro
2026-02-04 18:16 ` Waiman Long
2026-02-04 20:18 ` Al Viro
2026-02-05 3:03 ` Waiman Long
2026-02-05 4:45 ` Waiman Long
2026-02-05 23:53 ` Al Viro
2026-02-06 1:20 ` Waiman Long
2026-02-06 4:11 ` Waiman Long
2026-02-06 4:19 ` Waiman Long
2026-02-06 5:22 ` Al Viro
2026-02-06 6:31 ` Al Viro
2026-02-06 6:38 ` Al Viro
2026-02-06 7:13 ` Al Viro
2026-02-06 19:16 ` Waiman Long
2026-02-06 20:04 ` Waiman Long
2026-02-06 20:38 ` Al Viro
2026-02-07 8:25 ` Al Viro [this message]
2026-02-07 23:06 ` [PATCH][RFC] bug in unshare(2) failure recovery Waiman Long
2026-02-17 12:49 ` Christian Brauner
2026-02-17 12:49 ` Christian Brauner
2026-02-06 20:29 ` [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths Al Viro
2026-02-06 20:58 ` setns(2) vs. pivot_root(2) (was Re: [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths) Al Viro
2026-02-06 21:09 ` Al Viro
2026-02-17 13:12 ` Christian Brauner
2026-02-06 8:15 ` [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths Al Viro
2026-02-05 5:22 ` Al Viro
2026-02-05 13:59 ` Waiman Long
2026-02-05 17:53 ` Mateusz Guzik
2026-02-17 13:33 ` Christian Brauner
2026-02-17 13:44 ` Mateusz Guzik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260207082524.GE3183987@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=audit@vger.kernel.org \
--cc=brauner@kernel.org \
--cc=eparis@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=llong@redhat.com \
--cc=paul@paul-moore.com \
--cc=rgb@redhat.com \
--cc=rrobaina@redhat.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.