All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Waiman Long <llong@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>, Eric Paris <eparis@redhat.com>,
	Christian Brauner <brauner@kernel.org>,
	linux-kernel@vger.kernel.org, audit@vger.kernel.org,
	Richard Guy Briggs <rgb@redhat.com>,
	Ricardo Robaina <rrobaina@redhat.com>
Subject: setns(2) vs. pivot_root(2) (was Re: [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths)
Date: Fri, 6 Feb 2026 20:58:04 +0000	[thread overview]
Message-ID: <20260206205804.GC3183987@ZenIV> (raw)
In-Reply-To: <20260206202933.GA3183987@ZenIV>

On Fri, Feb 06, 2026 at 08:29:33PM +0000, Al Viro wrote:

> Look: the case where we might get passed current->fs down there is real.
> It can happen in one and only one situation - CLONE_NEWNS in unshare(2)
> arguments *and* current->fs->users being 1.
> 
> It wouldn't suffice, since there's chroot_fs_refs() that doesn't give
> a rat's arse for task->fs being ours - it goes and replaces every
> ->fs->pwd or ->fs->root that happens to point to old_root.
> 
> It's still not a real race, though - both chroot_fs_refs() and that area
> in copy_mnt_ns() are serialized on namespace_sem.
> 
> And yes, it's obscenely byzantine.  It gets even worse when you consider
> the fact that pivot_root(2) does not break only because the refcount
> drops in chroot_fs_refs() are guaranteed not to reach 0 - the caller is
> holding its own references to old_root.{mnt,dentry} and *thar* does not
> get dropped until we drop namespace_sem.
> 
> IOW, that shit is actually safe, but man, has its correctness grown fucking
> convoluted...
> 
> Grabbing fs->seq in copy_mnt_ns() wouldn't make the things better, though -
> it seriously relies upon the same exclusion with chroot_fs_refs() for
> correctness; unless you are willing to hold it over the entire walk through
> the mount tree, the proof of correctness doesn't get any simpler.

Speaking of the race that _is_ there: pidfd setns() vs. pivot_root().
pivot_root() (well, chroot_fs_refs()) goes over all threads and flips their
->fs->{root,pwd} for the ones that used to be at old_root.  The trouble is,
in case where we have setns() with more than just CLONE_NEWNS in flags, we
end up creating a temporary fs_struct, passing that to mntns_install() and
then copying its pwd and root back to the caller's if everything goes well.

That temporary is _not_ going to be found by chroot_fs_refs(), though, so
it misses the update by pivot_root().

  reply	other threads:[~2026-02-06 20:56 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-03 19:44 [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths Waiman Long
2026-02-03 19:59 ` Al Viro
2026-02-03 20:18   ` Waiman Long
2026-02-03 20:05 ` Al Viro
2026-02-03 20:32   ` Waiman Long
2026-02-03 21:50     ` Al Viro
2026-02-03 23:26       ` Al Viro
2026-02-04  4:21         ` Waiman Long
2026-02-04  6:26           ` Al Viro
2026-02-04 18:16             ` Waiman Long
2026-02-04 20:18               ` Al Viro
2026-02-05  3:03                 ` Waiman Long
2026-02-05  4:45                   ` Waiman Long
2026-02-05 23:53                     ` Al Viro
2026-02-06  1:20                       ` Waiman Long
2026-02-06  4:11                         ` Waiman Long
2026-02-06  4:19                           ` Waiman Long
2026-02-06  5:22                           ` Al Viro
2026-02-06  6:31                             ` Al Viro
2026-02-06  6:38                               ` Al Viro
2026-02-06  7:13                             ` Al Viro
2026-02-06 19:16                             ` Waiman Long
2026-02-06 20:04                               ` Waiman Long
2026-02-06 20:38                                 ` Al Viro
2026-02-07  8:25                                 ` [PATCH][RFC] bug in unshare(2) failure recovery Al Viro
2026-02-07 23:06                                   ` Waiman Long
2026-02-17 12:49                                   ` Christian Brauner
2026-02-17 12:49                                   ` Christian Brauner
2026-02-06 20:29                               ` [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths Al Viro
2026-02-06 20:58                                 ` Al Viro [this message]
2026-02-06 21:09                                   ` setns(2) vs. pivot_root(2) (was Re: [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths) Al Viro
2026-02-17 13:12                                     ` Christian Brauner
2026-02-06  8:15                       ` [PATCH v2] audit: Avoid excessive dput/dget in audit_context setup and reset paths Al Viro
2026-02-05  5:22                   ` Al Viro
2026-02-05 13:59                     ` Waiman Long
2026-02-05 17:53                     ` Mateusz Guzik
2026-02-17 13:33                       ` Christian Brauner
2026-02-17 13:44                         ` Mateusz Guzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260206205804.GC3183987@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=audit@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=eparis@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llong@redhat.com \
    --cc=paul@paul-moore.com \
    --cc=rgb@redhat.com \
    --cc=rrobaina@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.