From: "Daniel P. Berrange" <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: Richard Weinberger <richard-/L3Ra7n9ekc@public.gmane.org>,
    Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>,
    "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Subject: Re: Interaction user namespace, /proc/1 ownership & cap_set
Date: Mon, 1 Jul 2013 17:19:46 +0100
Message-ID: <20130701161946.GR15954@redhat.com>
In-Reply-To: <20130701161625.GQ15954-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Mon, Jul 01, 2013 at 05:16:25PM +0100, Daniel P. Berrange wrote:
> I'm struggling debugging a strange problem with interaction between user
> namespaces, cap_set and ownership of files in /proc/1/
>
> I'm using a modified version (attached to this mail) of the demo program
> userns_child_exec.c linked on https://lwn.net/Articles/532593/
>
> $ gcc -Wall -o userns_child_exec userns_child_exec.c -lcap
>
> First normal execution appears to work just fine (as root):
>
> $ ./userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash
> Launching child init
> # umount /proc/sys/fs/binfmt_misc
> # umount /proc/sys/fs/binfmt_misc
> # umount /proc/fs/nfsd
> # umount /proc
> # mount -t proc proc /proc/
> # ls -al /proc/1/environ
> -r--------. 1 root root 0 Jul 1 17:04 /proc/1/environ
>
>
> My modification adds support for a '-c' arg, which makes the program use
> cap_set() from libcap.so to remove the CAP_SYS_MODULE capability.
>
> If I run the program with the '-c' arg present, then the files in
> the /proc/1/ directory all end up owned by nfsnobody.nfsnobody
>
> $ ./userns_child_exec -c -p -m -U -M '0 1000 1' -G '0 1000 1' bash
> Launching child init
> # umount /proc/sys/fs/binfmt_misc
> # umount /proc/sys/fs/binfmt_misc
> # umount /proc/fs/nfsd
> # umount /proc
> # mount -t proc proc /proc/
> # ls -al /proc/1/environ
> -r--------. 1 nfsnobody nfsnobody 0 Jul 1 17:01 /proc/1/environ
>
> Why on earth would calling 'cap_set()' to drop a capability cause
> the user/group ownership of files in /proc/1/ to change ?
>
> Any child processes launched from this point get correct ownership
> on their /proc/NNN files - only /proc/1/ seems to be affected.
>
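A rough sketch of how the ownership check above could be done from code rather than by reading ls output. It relies on the fact that a uid with no mapping in the caller's user namespace is reported by stat() as the overflow uid from /proc/sys/kernel/overflowuid (65534 by default, which Fedora names nfsnobody). The helper names read_overflow_uid() and owner_is_unmapped() are made up for illustration and are not part of userns_child_exec.c.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Read the kernel's overflow uid. Falls back to the usual default of
 * 65534 if the sysctl file cannot be read or parsed. */
unsigned long read_overflow_uid(void)
{
    unsigned long id = 65534;
    FILE *f = fopen("/proc/sys/kernel/overflowuid", "r");

    if (f) {
        if (fscanf(f, "%lu", &id) != 1)
            id = 65534;
        fclose(f);
    }
    return id;
}

/* Returns 1 if the owner of 'path' shows up as the overflow uid (so it is
 * either unmapped in this user namespace or really owned by that id),
 * 0 if it shows up as any other uid, and -1 if stat() fails. */
int owner_is_unmapped(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) == -1)
        return -1;
    return (unsigned long)sb.st_uid == read_overflow_uid();
}
```

Calling owner_is_unmapped("/proc/1/environ") from inside the container would presumably return 0 in the first run above and 1 in the '-c' run.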
> Via strace, we can see the libcap code only calls 3 syscalls:
>
> capget({_LINUX_CAPABILITY_VERSION_3, 0}, NULL) = 0
> capget({_LINUX_CAPABILITY_VERSION_3, 0}, {CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP, CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP, 0}) = 0
> capset({_LINUX_CAPABILITY_VERSION_3, 0}, {CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP, CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP, 0}) = 0
>
> though, for added fun, when running the demo program via strace
> the problem does not appear :-(
>
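For anyone wanting to reproduce this without libcap in the picture, the syscall sequence in the strace output can be approximated directly. This is only a sketch under assumptions: it uses the raw capget/capset syscalls with the version 3 header from <linux/capability.h>, clears one capability from the effective and permitted sets, and leaves the inheritable set as it was. The function names clear_cap_bit() and drop_cap_raw() are hypothetical, and it skips the initial capget(..., NULL) version probe that libcap performs.

```c
#include <assert.h>
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Clear 'cap' from the effective and permitted sets of a version 3
 * capability data array (two 32-bit words per set). */
void clear_cap_bit(struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3],
                   int cap)
{
    assert(cap >= 0 && cap < 32 * _LINUX_CAPABILITY_U32S_3);

    __u32 mask = ~((__u32)1 << (cap % 32));

    data[cap / 32].effective &= mask;
    data[cap / 32].permitted &= mask;
}

/* Fetch the calling thread's capabilities, remove 'cap', and write the
 * result back. Returns 0 on success, -1 on failure with errno set. */
int drop_cap_raw(int cap)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    memset(&hdr, 0, sizeof(hdr));
    hdr.version = _LINUX_CAPABILITY_VERSION_3;
    hdr.pid = 0;                      /* 0 means the calling thread */

    if (syscall(SYS_capget, &hdr, data) == -1)
        return -1;

    clear_cap_bit(data, cap);

    return (int)syscall(SYS_capset, &hdr, data);
}
```

Calling drop_cap_raw(CAP_SYS_MODULE) in the child before exec'ing the shell should issue a capget/capset pair comparable to the trace above.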
>
>
> On a slightly related topic, I've noticed that it is not possible to
> invoke prctl(PR_CAPBSET_DROP) to clear the bounding set for processes
> inside a container. The kernel code uses capable() instead of ns_capable().
> Is this intended, or a missing conversion ?
>
> Indeed, even ignoring namespaces for a minute, I'm curious as to why
> CAP_SETPCAP is required at all for PR_CAPBSET_DROP ? Is it really
> a security risk to allow a non-privileged user to remove bits from
> the bounding set ? For KVM I'd like to be able to use PR_CAPBSET_DROP
> to prevent a compromised KVM process from using any setuid program to
> re-gain any kind of capabilities. Similarly I think a container admin
> may well wish to make use of PR_CAPBSET_DROP to lock down applications
> there.
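To make the PR_CAPBSET_DROP question concrete, here is a minimal sketch of the calls involved. It assumes only prctl(2) with PR_CAPBSET_READ and PR_CAPBSET_DROP as documented. The wrapper names bset_has() and bset_drop() are invented for this example. Per the description above, the drop is expected to fail with EPERM when the caller lacks CAP_SETPCAP in the initial user namespace.

```c
#include <assert.h>
#include <errno.h>
#include <linux/capability.h>
#include <sys/prctl.h>

/* Returns 1 if 'cap' is in the calling thread's bounding set, 0 if it is
 * not, and -1 (errno = EINVAL) if the kernel does not know 'cap'. */
int bset_has(int cap)
{
    assert(cap >= 0);
    return prctl(PR_CAPBSET_READ, (unsigned long)cap, 0, 0, 0);
}

/* Try to remove 'cap' from the bounding set. Returns 0 on success or the
 * errno value on failure (EPERM when the privilege check is not passed). */
int bset_drop(int cap)
{
    assert(cap >= 0);
    if (prctl(PR_CAPBSET_DROP, (unsigned long)cap, 0, 0, 0) == -1)
        return errno;
    return 0;
}
```

Running bset_drop(CAP_SYS_MODULE) as root inside the container and checking for EPERM would show whether the capable() check is what blocks the call.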
Oops, I should have mentioned that I'm using a 3.9.4 kernel. Basically the
Fedora 3.9.4-303 build, but with CONFIG_XFS_FS=n and CONFIG_USER_NS=y
set in the kernel config.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|