Util-Linux package development
* Re: Bug#1134639: nsenter -t 1 -m escapes mount and pid namespaces
       [not found] <aejkWHDXmpCX7Gh7@smyga.hemma>
@ 2026-06-03 18:17 ` Chris Hofstaedtler
  2026-06-04  4:03   ` Christian Albrecht Goeschel Ndjomouo
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Hofstaedtler @ 2026-06-03 18:17 UTC
  To: util-linux; +Cc: Ralph Ronnquist, 1134639

Hi Ralph,

sorry for the late reply. I am not an expert on namespaces, and have 
thus forwarded your bug to the upstream mailing list.

On Thu, Apr 23, 2026 at 01:08:08AM +1000, Ralph Ronnquist wrote:
> I observed this in a simple test setup, with an ordinary filesystem
> built with {debootstrap --variant=minbase sid FS ...}
> 
> First: {unshare -m -p -f chroot FS} will change root into that
> filesystem with unshared mount and pid namespaces.
> 
> Next: {mount -t proc proc /proc} will mount the procfs for that pid
> namespace. We see with {ls -l /proc/1/ns/mnt} the identity of the
> unshared mount namespace, which is different from the identity before
> chroot.
> 
> But: {nsenter -t 1 -m -- ls -l /proc/1/ns/mnt} shows the identity of
> the host mount namespace -- the outer namespace.
> 
> Thus {nsenter -t 1 -m} "escapes" from the unshared namespace to the
> containing namespace. And for example: {nsenter -t 1 -m /bin/sh}
> starts a shell in the outer mount and pid namespace(s)!
> 
> This seems to be a severe bug.
> 
> Apparently {nsenter -t 1 -m} finds pid 1 in the outer namespace rather
> than in the caller's pid namespace.

Hopefully someone from upstream can shed some light :-)

Chris


* Re: Bug#1134639: nsenter -t 1 -m escapes mount and pid namespaces
  2026-06-03 18:17 ` Bug#1134639: nsenter -t 1 -m escapes mount and pid namespaces Chris Hofstaedtler
@ 2026-06-04  4:03   ` Christian Albrecht Goeschel Ndjomouo
  2026-06-04 10:15     ` Ralph Ronnquist
  0 siblings, 1 reply; 3+ messages in thread
From: Christian Albrecht Goeschel Ndjomouo @ 2026-06-04  4:03 UTC
  To: Chris Hofstaedtler, util-linux@vger.kernel.org
  Cc: Ralph Ronnquist, 1134639@bugs.debian.org

> First: {unshare -m -p -f chroot FS} will change root into that
> filesystem with unshared mount and pid namespaces.
>

This successfully changes the root directory of the child process.
However, the root mount of the newly created mount namespace still
points to the host's root filesystem, which is the actual cause of the
escape (it will become clearer below).
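
One rough way to tell the two situations apart from inside the shell is to
look for an entry with mount point "/" in /proc/self/mountinfo. The kernel
leaves out mounts that lie outside the process's root directory, so after a
plain chroot into a directory that is not itself a mount point there should
be no such entry. This is only a heuristic sketch under that assumption (it
says nothing if FS happens to be a mount point), and the function name
has_root_mount is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the mountinfo text read from stdin
# contains an entry whose mount point (field 5) is "/".
has_root_mount() {
    awk '$5 == "/" { found = 1 } END { exit !found }'
}

# Typical use inside the shell under test (needs a mounted /proc):
#   has_root_mount < /proc/self/mountinfo && echo "root is a mount point"
```

After the plain chroot from the report I would expect the check to fail, and
after a pivot_root onto a bind mount to succeed.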

> Next: {mount -t proc proc /proc} will mount the procfs for that pid
> namespace. We see with {ls -l /proc/1/ns/mnt} the identity of the
> unshared mount namespace, which is different from the identity before
> chroot.
>

As the mount(8) command inherits the execution context of the container
process, it sees its root filesystem as `FS`, so the procfs is mounted
on FS/proc, rightfully so. The ls command also runs in that context
and shows the container's mount namespace ID.

> But: {nsenter -t 1 -m -- ls -l /proc/1/ns/mnt} shows the identity of
> the host mount namespace -- the outer namespace.
>
> Thus {nsenter -t 1 -m} "escapes" from the unshared namespace to the
> containing namespace. And for example: {nsenter -t 1 -m /bin/sh}
> starts a shell in the outer mount and pid namespace(s)!
>

The reason you escaped is that when nsenter(1) calls setns(fd, CLONE_NEWNS),
the kernel sets the root directory of the calling process to the absolute root of
the target mount namespace. Whatever binary nsenter then executes is decoupled
from the container's chroot and points back to the host's root filesystem. This is why
you are also able to view the host's mount table or resolve paths relative to the host
fs while inside the container, for example when you executed a shell with nsenter(1).

If you wish to completely cut ties with the VFS structure of the host, you can make use
of pivot_root(8). It lets you change the root mount of the mount namespace and so truly
isolates it from the host.

You can do something like this:

$ unshare --mount --pid --fork
$ mount --bind FS FS/
$ cd FS/
$ mkdir -p old_root/
$ /sbin/pivot_root . old_root/
$ cd /
$ mount -t proc proc /proc
$ umount -l old_root/
$ rmdir old_root

You should then be able to see the exact same mnt namespace ID.

$ ls -l /proc/1/ns/mnt
[...] /proc/1/ns/mnt -> 'mnt:[4026533461]'
$ nsenter --mount --target 1 -- ls -l /proc/1/ns/mnt
[...] /proc/1/ns/mnt -> 'mnt:[4026533461]'
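
If you want to script that comparison rather than eyeball it, something
along these lines could work. It is only a sketch: ns_id and same_ns are
names invented here, and the commented invocation assumes it is run as root
inside the container with /proc mounted.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric ID from a namespace link
# such as "mnt:[4026533461]"; prints nothing for malformed input.
ns_id() {
    printf '%s\n' "$1" | sed -n 's/^[a-z_]*:\[\([0-9][0-9]*\)\]$/\1/p'
}

# Succeeds only when both links parse and name the same namespace.
same_ns() {
    a=$(ns_id "$1")
    b=$(ns_id "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Intended use inside the container (not run here):
#   same_ns "$(readlink /proc/1/ns/mnt)" \
#           "$(nsenter --mount --target 1 -- readlink /proc/1/ns/mnt)" \
#       && echo "same mount namespace" || echo "nsenter left the namespace"
```

Comparing the parsed IDs instead of the raw ls output avoids false mismatches
from differing timestamps or link formatting.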


Maybe Karel has more to say about this.

Anyways I hope this cleared up at least some of the confusion.


Christian Goeschel Ndjomouo





* Re: Bug#1134639: nsenter -t 1 -m escapes mount and pid namespaces
  2026-06-04  4:03   ` Christian Albrecht Goeschel Ndjomouo
@ 2026-06-04 10:15     ` Ralph Ronnquist
  0 siblings, 0 replies; 3+ messages in thread
From: Ralph Ronnquist @ 2026-06-04 10:15 UTC
  To: Christian Albrecht Goeschel Ndjomouo
  Cc: Chris Hofstaedtler, util-linux@vger.kernel.org,
	1134639@bugs.debian.org

On Thu, Jun 04, 2026 at 04:03:44AM +0000, Christian Albrecht Goeschel Ndjomouo wrote:
> > First: {unshare -m -p -f chroot FS} will change root into that
> > filesystem with unshared mount and pid namespaces.
> >
> 
> This successfully changes the root directory of the child process.
> However, the root mount of the newly created mount namespace still
> points to the host's root filesystem, which is the actual cause of the
> escape (it will become clearer below).
> 
> > Next: {mount -t proc proc /proc} will mount the procfs for that pid
> > namespace. We see with {ls -l /proc/1/ns/mnt} the identity of the
> > unshared mount namespace, which is different from the identity before
> > chroot.
> >
> 
> As the mount(8) command inherits the execution context of the container
> process, it sees its root filesystem as `FS`, so the procfs is mounted
> on FS/proc, rightfully so. The ls command also runs in that context
> and shows the container's mount namespace ID.
> 
> > But: {nsenter -t 1 -m -- ls -l /proc/1/ns/mnt} shows the identity of
> > the host mount namespace -- the outer namespace.
> >
> > Thus {nsenter -t 1 -m} "escapes" from the unshared namespace to the
> > containing namespace. And for example: {nsenter -t 1 -m /bin/sh}
> > starts a shell in the outer mount and pid namespace(s)!
> >
> 
> The reason you escaped is that when nsenter(1) calls setns(fd, CLONE_NEWNS),
> the kernel sets the root directory of the calling process to the absolute root of
> the target mount namespace. Whatever binary nsenter then executes is decoupled
> from the container's chroot and points back to the host's root filesystem. This is why
> you are also able to view the host's mount table or resolve paths relative to the host
> fs while inside the container, for example when you executed a shell with nsenter(1).
> 
> If you wish to completely cut ties with the VFS structure of the host, you can make use
> of pivot_root(8). It lets you change the root mount of the mount namespace and so truly
> isolates it from the host.
> 
> You can do something like this:
> 
> $ unshare --mount --pid --fork
> $ mount --bind FS FS/
> $ cd FS/
> $ mkdir -p old_root/
> $ /sbin/pivot_root . old_root/
> $ cd /
> $ mount -t proc proc /proc
> $ umount -l old_root/
> $ rmdir old_root
> 
> You should then be able to see the exact same mnt namespace ID.
> 
> $ ls -l /proc/1/ns/mnt
> [...] /proc/1/ns/mnt -> 'mnt:[4026533461]'
> $ nsenter --mount --target 1 -- ls -l /proc/1/ns/mnt
> [...] /proc/1/ns/mnt -> 'mnt:[4026533461]'
> 
> 
> Maybe Karel has more to say about this.
> 
> Anyways I hope this cleared up at least some of the confusion.

Quite subtle, but I can confirm it also in my actual setting (which is a
simple and plain "overlay-boot" example).

I will need a couple of sleeps before I fully grasp that "absolute
root" notion. However, the recipe you outline does bring the desired
effect of eliminating that namespace escape for me.

Thanks.

Ralph


