Linux Container Development
From: "Daniel P. Berrange" <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: Richard Weinberger <richard-/L3Ra7n9ekc@public.gmane.org>,
	Serge Hallyn
	<serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>,
	"Eric W. Biederman"
	<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Subject: Re: Interaction user namespace, /proc/1 ownership & cap_set
Date: Mon, 1 Jul 2013 17:19:46 +0100
Message-ID: <20130701161946.GR15954@redhat.com>
In-Reply-To: <20130701161625.GQ15954-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Mon, Jul 01, 2013 at 05:16:25PM +0100, Daniel P. Berrange wrote:
> I'm struggling to debug a strange problem with the interaction between
> user namespaces, cap_set and the ownership of files in /proc/1/
> 
> I'm using a modified version (attached to this mail) of the demo program
> userns_child_exec.c linked on https://lwn.net/Articles/532593/
> 
>   $ gcc -lcap -Wall -o userns_child_exec userns_child_exec.c 
> 
> First normal execution appears to work just fine (as root):
> 
>   $ ./userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash
>   Launching child init
>   # umount /proc/sys/fs/binfmt_misc
>   # umount /proc/sys/fs/binfmt_misc
>   # umount /proc/fs/nfsd
>   # umount /proc
>   # mount -t proc proc /proc/
>   # ls -al /proc/1/environ 
>   -r--------. 1 root root 0 Jul  1 17:04 /proc/1/environ
> 
> 
> My modification adds support for a '-c' arg that makes the program call
> cap_set() from libcap.so in order to remove the CAP_SYS_MODULE capability.
> 
> If I run the program with the '-c' arg present, then the files in
> the /proc/1/ directory all end up owned by nfsnobody.nfsnobody
> 
>   $ ./userns_child_exec -c -p -m -U -M '0 1000 1' -G '0 1000 1' bash
>   Launching child init
>   # umount /proc/sys/fs/binfmt_misc
>   # umount /proc/sys/fs/binfmt_misc
>   # umount /proc/fs/nfsd
>   # umount /proc
>   # mount -t proc proc /proc/
>   # ls -al /proc/1/environ 
>   -r--------. 1 nfsnobody nfsnobody 0 Jul  1 17:01 /proc/1/environ
> 
> Why on earth would calling 'cap_set()' to drop a capability cause
> the user/group ownership of files in /proc/1/ to change ?
> 
> Any child processes launched from this point get correct ownership
> on their /proc/NNN files - only /proc/1/ seems to be affected.
> 
> Via strace, we can see the libcap code only calls 3 syscalls:
> 
> capget({_LINUX_CAPABILITY_VERSION_3, 0}, NULL) = 0
> capget({_LINUX_CAPABILITY_VERSION_3, 0},
>   {CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|
>    CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|
>    CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|
>    CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|
>    CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|
>    CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|
>    CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP,
>    CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|
>    CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|
>    CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|
>    CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|
>    CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|
>    CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|
>    CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP,
>    0}) = 0
> capset({_LINUX_CAPABILITY_VERSION_3, 0},
>   {CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|
>    CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|
>    CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|
>    CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_RAWIO|CAP_SYS_CHROOT|
>    CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|
>    CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|
>    CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP,
>    CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|
>    CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|
>    CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|
>    CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_RAWIO|CAP_SYS_CHROOT|
>    CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|
>    CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|
>    CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP,
>    0}) = 0
> 
> though, for added fun, when running the demo program via strace
> the problem does not appear :-(
> 
> 
> 
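
For anyone wanting to reproduce the capget/capset sequence from the quoted
strace output without pulling in libcap, here is a rough sketch. It is not
the code from the attached userns_child_exec.c: it issues the raw syscalls
directly, and the helper names drop_capability() and
has_effective_capability() are made up for illustration. It clears one bit
from the effective and permitted sets and leaves inheritable untouched.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from the original mail): fetch the calling
 * thread's capability sets with the v3 ABI. Returns 0 or -1 (errno set). */
static int read_caps(struct __user_cap_header_struct *hdr,
                     struct __user_cap_data_struct *data)
{
    memset(hdr, 0, sizeof(*hdr));
    memset(data, 0, _LINUX_CAPABILITY_U32S_3 * sizeof(*data));
    hdr->version = _LINUX_CAPABILITY_VERSION_3;
    hdr->pid = 0; /* 0 means the calling thread */
    return syscall(SYS_capget, hdr, data) < 0 ? -1 : 0;
}

/* Hypothetical helper: clear 'cap' from the effective and permitted sets,
 * mirroring the capget + capset pair seen in the strace output.
 * Returns 0 on success, -1 with errno set on failure. */
static int drop_capability(int cap)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    /* Passing an out-of-range capability is a programming error. */
    assert(cap >= 0 && cap <= CAP_LAST_CAP);

    if (read_caps(&hdr, data) < 0)
        return -1;

    data[CAP_TO_INDEX(cap)].effective &= ~CAP_TO_MASK(cap);
    data[CAP_TO_INDEX(cap)].permitted &= ~CAP_TO_MASK(cap);

    return syscall(SYS_capset, &hdr, data) < 0 ? -1 : 0;
}

/* Hypothetical helper: 1 if 'cap' is in the effective set, 0 if not,
 * -1 if capget failed. */
static int has_effective_capability(int cap)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    assert(cap >= 0 && cap <= CAP_LAST_CAP);

    if (read_caps(&hdr, data) < 0)
        return -1;
    return (data[CAP_TO_INDEX(cap)].effective & CAP_TO_MASK(cap)) ? 1 : 0;
}
```

Calling drop_capability(CAP_SYS_MODULE) before exec'ing the shell should
produce the same syscall pattern as the '-c' option described above. Since
it only ever removes bits, it does not need any privilege to succeed.
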
> On a slightly related topic, I've noticed that it is not possible to
> invoke prctl(PR_CAPBSET_DROP) to clear the bounding set for processes
> inside a container. The kernel code uses capable() instead of ns_capable().
> Is this intended, or a missing conversion ?
> 
> Indeed, even ignoring namespaces for a minute, I'm curious as to why
> CAP_SETPCAP is required at all for PR_CAPBSET_DROP ?  Is it really
> a security risk to allow a non-privileged user to remove bits from
> the bounding set ? For KVM I'd like to be able to use PR_CAPBSET_DROP
> to prevent a compromised KVM process from using any setuid program to
> re-gain any kind of capabilities.  Similarly I think a container admin
> may well wish to make use of PR_CAPBSET_DROP to lock down applications
> there.
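
To make the bounding set question concrete, a minimal sketch of the call
in question follows. The helper names drop_bounding_capability() and
in_bounding_set() are invented for this example; only prctl() with
PR_CAPBSET_DROP and PR_CAPBSET_READ is the real interface. Without
CAP_SETPCAP the drop is expected to fail with EPERM, which is the
behaviour being asked about.

```c
#include <assert.h>
#include <errno.h>
#include <linux/capability.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper: remove 'cap' from the calling thread's capability
 * bounding set. Returns 0 on success, -1 with errno set on failure
 * (EPERM when the caller lacks CAP_SETPCAP). */
static int drop_bounding_capability(int cap)
{
    /* Passing an out-of-range capability is a programming error. */
    assert(cap >= 0 && cap <= CAP_LAST_CAP);
    return prctl(PR_CAPBSET_DROP, (unsigned long)cap, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: 1 if 'cap' is still in the bounding set,
 * 0 if it has been dropped, -1 on error. */
static int in_bounding_set(int cap)
{
    assert(cap >= 0 && cap <= CAP_LAST_CAP);
    return prctl(PR_CAPBSET_READ, (unsigned long)cap, 0UL, 0UL, 0UL);
}

/* Try the drop and report what happened on stderr. */
static void report_bounding_drop(int cap)
{
    if (drop_bounding_capability(cap) < 0)
        fprintf(stderr, "PR_CAPBSET_DROP(%d) failed: %s\n",
                cap, strerror(errno));
    else
        fprintf(stderr, "capability %d in bounding set: %d\n",
                cap, in_bounding_set(cap));
}
```

Running report_bounding_drop(CAP_SYS_MODULE) as an unprivileged user on
the host, and again as root inside the user namespace, is one way to
compare the two cases discussed above.
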


Oops, I should have mentioned that I'm using a 3.9.4 kernel. Basically the
Fedora 3.9.4-303 build, but with CONFIG_XFS_FS=n and CONFIG_USER_NS=y
set in the Kconfig.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
