From: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Aristeu Rozanski <aris-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: Why does devices cgroup check for CAP_SYS_ADMIN explicitly?
Date: Tue, 6 Nov 2012 09:45:34 -0600
Message-ID: <20121106154534.GA20237@sergelap>
In-Reply-To: <871ug6rbio.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:
> 
> > Hello, Serge.
> >
> > On Tue, Nov 06, 2012 at 09:01:32AM -0600, Serge Hallyn wrote:
> >> More practically, lacking user namespaces you can create a full (i.e.
> >> ubuntu) container that doesn't have CAP_SYS_ADMIN, but not one without
> >> root.  So this allows you to prevent containers from bypassing devices
> >> cgroup restrictions set by the parent.  (In reality we are not using
> >> that in ubuntu - we grant CAP_SYS_ADMIN and use apparmor to restrict -
> >> but other distros do.)
> >
> > Do you even mount cgroupfs in containers?  If you just bind-mount
> > cgroupfs verbatim in containers, I don't think that's gonna work very
> > well.  If not, all this doesn't make any difference for containers.
> >
> > So, you don't really have any actual use case for the explicit CAP_*
> > checks, right?
> 
> Having thought about this a little more I can give a definitive answer.
> 
> Adding a process to the device control group is equivalent to calling
> mknod, as it allows that process to open device nodes, or equivalently
> not open device nodes.  Therefore a capable check is absolutely
> required.
> 
> Without a capability check it would be possible to remove access to
> /dev/console for a suid root application keeping it from reporting
> attempts to hack it for example.
> 
> update_access can allow access to previously inaccessible devices,
> so it is equivalent to mknod and as such requires a capability call.
> 
> static int devcgroup_update_access(struct dev_cgroup *devcgroup,
> 				   int filetype, const char *buffer)
> {
> ....
> 	if (!capable(CAP_SYS_ADMIN))
> 		return -EPERM;
> 
> 
> Likewise, moving to a different cgroup can give you a completely
> different set of devices you can use.  It is roughly equivalent to
> mknod, and needs a capability call.
> 
> static int devcgroup_can_attach(struct cgroup *new_cgrp,
> 				struct cgroup_taskset *set)
> {
> 	struct task_struct *task = cgroup_taskset_first(set);
> 
> 	if (current != task && !capable(CAP_SYS_ADMIN))
> 		return -EPERM;
> 
> 
> The generic cgroup check in attach_task_by_pid to see if you can move
> another process into a cgroup needs to be a capability call and not a
> test for uid == 0.
> 
> static int attach_task_by_pid(struct cgroup *cgrp, u64 pid, bool threadgroup)
> {
> 	if (pid) {
> 		tsk = find_task_by_vpid(pid);
> 
> 		/*
> 		 * even if we're attaching all tasks in the thread group, we
> 		 * only need to check permissions on one of them.
> 		 */
> 		tcred = __task_cred(tsk);
> 		if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
>                                         ^^^^^^^^^^^^^^^
> 		    !uid_eq(cred->euid, tcred->uid) &&
> 		    !uid_eq(cred->euid, tcred->suid)) {
> 			rcu_read_unlock();
> 			ret = -EACCES;
> 			goto out_unlock_cgroup;
> 
> Eric

(full context kept, though long, because it's all important)
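
To make Eric's last point concrete, here is a rough user-space model of
the two kinds of check, not kernel code.  struct model_cred,
model_capable() and both may_attach_*() helpers are made up for
illustration; only the bit number 21 for CAP_SYS_ADMIN comes from
linux/capability.h.  It shows the case from earlier in the thread: a
container whose root has had CAP_SYS_ADMIN dropped passes a uid == 0
test but fails a capability test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CAP_SYS_ADMIN is bit 21 in linux/capability.h. */
#define MODEL_CAP_SYS_ADMIN 21

/* Hypothetical, simplified stand-in for the kernel's struct cred. */
struct model_cred {
	unsigned int euid, uid, suid;
	uint64_t cap_effective;	/* one bit per capability */
};

/* Simplified stand-in for capable(): is the bit set in the effective set? */
static bool model_capable(const struct model_cred *c, int cap)
{
	return (c->cap_effective >> cap) & 1;
}

/* Mirrors the shape of the quoted uid-based test in attach_task_by_pid. */
static bool may_attach_uid_check(const struct model_cred *cred,
				 const struct model_cred *tcred)
{
	return cred->euid == 0 ||
	       cred->euid == tcred->uid ||
	       cred->euid == tcred->suid;
}

/* Same test with the uid == 0 clause replaced by a capability test. */
static bool may_attach_cap_check(const struct model_cred *cred,
				 const struct model_cred *tcred)
{
	return model_capable(cred, MODEL_CAP_SYS_ADMIN) ||
	       cred->euid == tcred->uid ||
	       cred->euid == tcred->suid;
}

/* Container root without CAP_SYS_ADMIN, moving a task owned by uid 1000. */
static void model_demo(void)
{
	struct model_cred victim = { 1000, 1000, 1000, 0 };
	struct model_cred container_root = { 0, 0, 0, 0 };

	assert(may_attach_uid_check(&container_root, &victim));
	assert(!may_attach_cap_check(&container_root, &victim));
}
```

Calling model_demo() passes both assertions: the uid test lets the
capability-less root move the task, the capability test does not.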

Note that part of the problem is simply that the devices cgroup is serving
as a stand-in for the lack of both user and device namespaces.  If those
both existed, we could get rid of the devices cgroup.  Likewise, the
presence of the devices cgroup makes a device namespace far less
compelling :)  We can play games with bind mounts into /dev and devcgroup
to do most of what we would want a device namespace for.
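
For the devcgroup half of that, the rules in question are the
"type major:minor access" strings written to devices.allow and
devices.deny in the v1 devices cgroup.  A small sketch of building one
follows; format_dev_rule() and demo_console_rule() are hypothetical
helpers, not an existing API, and I am assuming the usual 5:1 char
device numbers for /dev/console.  Actually writing the string to
devices.allow is where the capable(CAP_SYS_ADMIN) check quoted above
would come into play.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a devices cgroup rule such as "c 5:1 rwm" into buf.
 * type is 'b' (block) or 'c' (char); access is a non-empty string made
 * only of 'r', 'w' and 'm' (mknod).
 * Returns the rule length, or -1 on bad input or if buf is too small.
 */
static int format_dev_rule(char *buf, size_t len, char type,
			   unsigned int major, unsigned int minor,
			   const char *access)
{
	int n;

	if (type != 'b' && type != 'c')
		return -1;
	if (access[0] == '\0' || strspn(access, "rwm") != strlen(access))
		return -1;

	n = snprintf(buf, len, "%c %u:%u %s", type, major, minor, access);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/* Build the rule that would grant full access to /dev/console (assumed 5:1). */
static void demo_console_rule(void)
{
	char rule[32];
	int n = format_dev_rule(rule, sizeof(rule), 'c', 5, 1, "rwm");

	assert(n > 0);
	assert(strcmp(rule, "c 5:1 rwm") == 0);
}
```

The wildcard forms the file also accepts (a bare "a", or "*" for major
or minor) are left out to keep the sketch short.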

-serge

