Linux Container Development
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Cc: Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Linux Containers
	<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Re: [PATCH 4/4] The control group itself
Date: Mon, 11 Feb 2008 11:38:30 -0600	[thread overview]
Message-ID: <20080211173830.GA22160@sergelap.ibm.com> (raw)
In-Reply-To: <47AB013B.8060502-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>

Quoting Pavel Emelyanov (xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org):
> Each new group will have its own maps for char and block
> layers. The devices access list is tuned via the 
> devices.permissions file. One may read from the file to get 
> the configured state.
> 
> The top container isn't initialized, so that the char 
> and block layers will use the global maps to lookup 
> their devices. I did that not to export the static maps
> to the outer world.
> 
> Good news is that this patch now contains more comments 
> and Documentation file :)

Seems to work as advertised  :)  I can't reproduce Suka's null/zero
bug.

You're relying fully on uid-0 to stop writes into the
devices.permissions files.  Would you mind adding a check for
CAP_SYS_ADMIN (or CAP_NS_OVERRIDE+CAP_MKNOD)?  Or were you really
counting on using the filesystem visibility cgroup to stop a container
from making changes to its device access whitelist?

thanks,
-serge

> Signed-off-by: Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
> 
> ---
> 
> diff --git a/Documentation/controllers/devices.txt b/Documentation/controllers/devices.txt
> new file mode 100644
> index 0000000..dbd0c7a
> --- /dev/null
> +++ b/Documentation/controllers/devices.txt
> @@ -0,0 +1,61 @@
> +
> +	Devices visibility controller
> +
> +This controller allows to tune the devices accessibility by tasks,
> +i.e. grant full access for /dev/null, /dev/zero etc, grant read-only
> +access to IDE devices and completely hide SCSI disks.
> +
> +Tasks still can call mknod to create device files, regardless of
> +whether the particular device is visible or accessible, but they
> +may not be able to open it later.
> +
> +This one hides under CONFIG_CGROUP_DEVS option.
> +
> +
> +Configuring
> +
> +The controller provides a single file to configure itself -- the
> +devices.permissions one. To change the accessibility level for some
> +device write the following string into it:
> +
> +[cb] <major>:(<minor>|*) [r-][w-]
> + ^          ^               ^
> + |          |               |
> + |          |               +--- access rights (1)
> + |          |
> + |          +-- device major and minor numbers (2)
> + |
> + +-- device type (character / block)
> +
> +1) The access rights set to '--' remove the device from the group's
> +access list, so that it will not even be shown in this file later.
> +
> +2) Setting the minor to '*' grants access to all the minors for
> +particular major.
> +
> +When reading from it, one may see something like
> +
> +	c 1:5 rw
> +	b 8:* r-
> +
> +Security issues, concerning who may grant access to what are governed
> +at the cgroup infrastructure level.
> +
> +
> +Examples:
> +
> +1. Grand full access to /dev/null
> +	# echo c 1:3 rw > /cgroups/<id>/devices.permissions
> +
> +2. Grant the read-only access to /dev/sda and partitions
> +	# echo b 8:* r- > ...
> +
> +3. Change the /dev/null access to write-only
> +	# echo c 1:3 -w > ...
> +
> +4. Revoke access to /dev/sda
> +	# echo b 8:* -- > ...
> +
> +
> +		Written by Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
> +
> diff --git a/fs/Makefile b/fs/Makefile
> index 7996220..5ad03be 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -64,6 +64,8 @@ obj-y				+= devpts/
> 
>  obj-$(CONFIG_PROFILING)		+= dcookies.o
>  obj-$(CONFIG_DLM)		+= dlm/
> +
> +obj-$(CONFIG_CGROUP_DEVS)	+= devscontrol.o
>   
>  # Do not add any filesystems before this line
>  obj-$(CONFIG_REISERFS_FS)	+= reiserfs/
> diff --git a/fs/devscontrol.c b/fs/devscontrol.c
> new file mode 100644
> index 0000000..48c5f69
> --- /dev/null
> +++ b/fs/devscontrol.c
> @@ -0,0 +1,314 @@
> +/*
> + * devscontrol.c - Device Controller
> + *
> + * Copyright 2007 OpenVZ SWsoft Inc
> + * Author: Pavel Emelyanov <xemul at openvz dot org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <linux/cgroup.h>
> +#include <linux/cdev.h>
> +#include <linux/err.h>
> +#include <linux/devscontrol.h>
> +#include <linux/uaccess.h>
> +#include <linux/fs.h>
> +#include <linux/genhd.h>
> +
> +struct devs_cgroup {
> +	/*
> +	 * The subsys state to build into cgrous infrastructure
> +	 */
> +	struct cgroup_subsys_state css;
> +
> +	/*
> +	 * The maps of character and block devices. They provide a
> +	 * map from dev_t-s to struct cdev/gendisk. See fs/char_dev.c
> +	 * and block/genhd.c to find out how the ->open() callbacks
> +	 * work when opening a device.
> +	 *
> +	 * Each group will have its onw maps, and at the open()
> +	 * time code will lookup in this map to get the device
> +	 * and permissions by its dev_t.
> +	 */
> +	struct kobj_map *cdev_map;
> +	struct kobj_map *bdev_map;
> +};
> +
> +static inline
> +struct devs_cgroup *css_to_devs(struct cgroup_subsys_state *css)
> +{
> +	return container_of(css, struct devs_cgroup, css);
> +}
> +
> +static inline
> +struct devs_cgroup *cgroup_to_devs(struct cgroup *cont)
> +{
> +	return css_to_devs(cgroup_subsys_state(cont, devs_subsys_id));
> +}
> +
> +struct kobj_map *task_cdev_map(struct task_struct *tsk)
> +{
> +	struct cgroup_subsys_state *css;
> +
> +	css = task_subsys_state(tsk, devs_subsys_id);
> +	if (css->cgroup->parent == NULL)
> +		return NULL;
> +	else
> +		return css_to_devs(css)->cdev_map;
> +}
> +
> +struct kobj_map *task_bdev_map(struct task_struct *tsk)
> +{
> +	struct cgroup_subsys_state *css;
> +
> +	css = task_subsys_state(tsk, devs_subsys_id);
> +	if (css->cgroup->parent == NULL)
> +		return NULL;
> +	else
> +		return css_to_devs(css)->bdev_map;
> +}
> +
> +static struct cgroup_subsys_state *
> +devs_create(struct cgroup_subsys *ss, struct cgroup *cont)
> +{
> +	struct devs_cgroup *devs;
> +
> +	devs = kzalloc(sizeof(struct devs_cgroup), GFP_KERNEL);
> +	if (devs == NULL)
> +		goto out;
> +
> +	devs->cdev_map = cdev_map_init();
> +	if (devs->cdev_map == NULL)
> +		goto out_free;
> +
> +	devs->bdev_map = bdev_map_init();
> +	if (devs->bdev_map == NULL)
> +		goto out_free_cdev;
> +
> +	return &devs->css;
> +
> +out_free_cdev:
> +	cdev_map_fini(devs->cdev_map);
> +out_free:
> +	kfree(devs);
> +out:
> +	return ERR_PTR(-ENOMEM);
> +}
> +
> +static void devs_destroy(struct cgroup_subsys *ss, struct cgroup *cont)
> +{
> +	struct devs_cgroup *devs;
> +
> +	devs = cgroup_to_devs(cont);
> +	bdev_map_fini(devs->bdev_map);
> +	cdev_map_fini(devs->cdev_map);
> +	kfree(devs);
> +}
> +
> +/*
> + * The devices.permissions file read/write functionality
> + *
> + * The following two routines parse and print the strings like
> + * [cb] <major>:(<minor>|*) [r-][w-]
> + */
> +
> +static int decode_perms_str(char *buf, int *chrdev, dev_t *dev,
> +		int *all, mode_t *mode)
> +{
> +	unsigned int major, minor;
> +	char *end;
> +	mode_t tmp;
> +
> +	if (buf[0] == 'c')
> +		*chrdev = 1;
> +	else if (buf[0] == 'b')
> +		*chrdev = 0;
> +	else
> +		return -EINVAL;
> +
> +	if (buf[1] != ' ')
> +		return -EINVAL;
> +
> +	major = simple_strtoul(buf + 2, &end, 10);
> +	if (*end != ':')
> +		return -EINVAL;
> +
> +	if (end[1] == '*') {
> +		if (end[2] != ' ')
> +			return -EINVAL;
> +
> +		*all = 1;
> +		minor = 0;
> +		end += 2;
> +	} else {
> +		minor = simple_strtoul(end + 1, &end, 10);
> +		if (*end != ' ')
> +			return -EINVAL;
> +
> +		*all = 0;
> +	}
> +
> +	tmp = 0;
> +
> +	if (end[1] == 'r')
> +		tmp |= FMODE_READ;
> +	else if (end[1] != '-')
> +		return -EINVAL;
> +	if (end[2] == 'w')
> +		tmp |= FMODE_WRITE;
> +	else if (end[2] != '-')
> +		return -EINVAL;
> +
> +	*dev = MKDEV(major, minor);
> +	*mode = tmp;
> +	return 0;
> +}
> +
> +static int encode_perms_str(char *buf, int len, int chrdev, dev_t dev,
> +		int all, mode_t mode)
> +{
> +	int ret;
> +
> +	ret = snprintf(buf, len, "%c %d:", chrdev ? 'c' : 'b', MAJOR(dev));
> +	if (all)
> +		ret += snprintf(buf + ret, len - ret, "*");
> +	else
> +		ret += snprintf(buf + ret, len - ret, "%d", MINOR(dev));
> +
> +	ret += snprintf(buf + ret, len - ret, " %c%c\n",
> +			(mode & FMODE_READ) ? 'r' : '-',
> +			(mode & FMODE_WRITE) ? 'w' : '-');
> +
> +	return ret + 1;
> +}
> +
> +static ssize_t devs_write(struct cgroup *cont, struct cftype *cft,
> +		struct file *f, const char __user *ubuf,
> +		size_t nbytes, loff_t *pos)
> +{
> +	int err, all, chrdev;
> +	dev_t dev;
> +	char buf[64];
> +	struct devs_cgroup *devs;
> +	mode_t mode;
> +
> +	if (copy_from_user(buf, ubuf, sizeof(buf)))
> +		return -EFAULT;
> +
> +	buf[sizeof(buf) - 1] = 0;
> +	err = decode_perms_str(buf, &chrdev, &dev, &all, &mode);
> +	if (err < 0)
> +		return err;
> +
> +	devs = cgroup_to_devs(cont);
> +
> +	/*
> +	 * No locking here is required - all that we need
> +	 * is provided inside the kobject mapping code
> +	 */
> +
> +	if (mode == 0) {
> +		if (chrdev)
> +			err = cdev_del_from_map(devs->cdev_map, dev, all);
> +		else
> +			err = bdev_del_from_map(devs->bdev_map, dev, all);
> +
> +		if (err < 0)
> +			return err;
> +
> +		css_put(&devs->css);
> +	} else {
> +		if (chrdev)
> +			err = cdev_add_to_map(devs->cdev_map, dev, all, mode);
> +		else
> +			err = bdev_add_to_map(devs->bdev_map, dev, all, mode);
> +
> +		if (err < 0)
> +			return err;
> +
> +		css_get(&devs->css);
> +	}
> +
> +	return nbytes;
> +}
> +
> +struct devs_dump_arg {
> +	char *buf;
> +	int pos;
> +	int chrdev;
> +};
> +
> +static int devs_dump_one(dev_t dev, int range, mode_t mode, void *x)
> +{
> +	struct devs_dump_arg *arg = x;
> +	char tmp[64];
> +	int len;
> +
> +	len = encode_perms_str(tmp, sizeof(tmp), arg->chrdev, dev,
> +			range != 1, mode);
> +
> +	if (arg->pos >= PAGE_SIZE - len)
> +		return 1;
> +
> +	memcpy(arg->buf + arg->pos, tmp, len);
> +	arg->pos += len;
> +	return 0;
> +}
> +
> +static ssize_t devs_read(struct cgroup *cont, struct cftype *cft,
> +		struct file *f, char __user *ubuf, size_t nbytes, loff_t *pos)
> +{
> +	struct devs_dump_arg arg;
> +	struct devs_cgroup *devs;
> +	ssize_t ret;
> +
> +	arg.buf = (char *)__get_free_page(GFP_KERNEL);
> +	if (arg.buf == NULL)
> +		return -ENOMEM;
> +
> +	devs = cgroup_to_devs(cont);
> +	arg.pos = 0;
> +
> +	arg.chrdev = 1;
> +	cdev_iterate_map(devs->cdev_map, devs_dump_one, &arg);
> +
> +	arg.chrdev = 0;
> +	bdev_iterate_map(devs->bdev_map, devs_dump_one, &arg);
> +
> +	ret = simple_read_from_buffer(ubuf, nbytes, pos,
> +			arg.buf, arg.pos);
> +
> +	free_page((unsigned long)arg.buf);
> +	return ret;
> +}
> +
> +static struct cftype devs_files[] = {
> +	{
> +		.name = "permissions",
> +		.write = devs_write,
> +		.read = devs_read,
> +	},
> +};
> +
> +static int devs_populate(struct cgroup_subsys *ss, struct cgroup *cont)
> +{
> +	return cgroup_add_files(cont, ss,
> +			devs_files, ARRAY_SIZE(devs_files));
> +}
> +
> +struct cgroup_subsys devs_subsys = {
> +	.name = "devices",
> +	.subsys_id = devs_subsys_id,
> +	.create = devs_create,
> +	.destroy = devs_destroy,
> +	.populate = devs_populate,
> +};
> diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
> index 228235c..9c0cd2c 100644
> --- a/include/linux/cgroup_subsys.h
> +++ b/include/linux/cgroup_subsys.h
> @@ -42,3 +42,9 @@ SUBSYS(mem_cgroup)
>  #endif
> 
>  /* */
> +
> +#ifdef CONFIG_CGROUP_DEVS
> +SUBSYS(devs)
> +#endif
> +
> +/* */
> diff --git a/include/linux/devscontrol.h b/include/linux/devscontrol.h
> new file mode 100644
> index 0000000..38057b9
> --- /dev/null
> +++ b/include/linux/devscontrol.h
> @@ -0,0 +1,26 @@
> +#ifndef __DEVS_CONTROL_H__
> +#define __DEVS_CONTROL_H__
> +struct kobj_map;
> +struct task_struct;
> +
> +/*
> + * task_[cb]dev_map - get a map from task. Both calls may return
> + * NULL, to indicate, that task doesn't belong to any group and
> + * that the global map is to be used.
> + */
> +
> +#ifdef CONFIG_CGROUP_DEVS
> +struct kobj_map *task_cdev_map(struct task_struct *);
> +struct kobj_map *task_bdev_map(struct task_struct *);
> +#else
> +static inline kobj_map *task_cdev_map(struct task_struct *tsk)
> +{
> +	return NULL;
> +}
> +
> +static inline kobj_map *task_bdev_map(struct task_struct *tsk)
> +{
> +	return NULL;
> +}
> +#endif
> +#endif
> diff --git a/init/Kconfig b/init/Kconfig
> index 732a1c2..f9a1b4f 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -283,6 +283,19 @@ config CGROUP_DEBUG
> 
>  	  Say N if unsure
> 
> +config CGROUP_DEVS
> +	bool "Devices cgroup subsystem"
> +	depends on CGROUPS
> +	help
> +	  Controlls the access rights to devices, i.e. you may hide
> +	  some of them from tasks, so that they will not be able
> +	  to open them, or you may grant a read-only access to some
> +	  of them.
> +
> +	  See Documentation/controllers/devices.txt for details.
> +
> +	  This is harmless to say N here, so do it if unsure.
> +
>  config CGROUP_NS
>          bool "Namespace cgroup subsystem"
>          depends on CGROUPS

  parent reply	other threads:[~2008-02-11 17:38 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-07 12:56 [PATCH 0/4] Devices accessibility control group (v3, release candidate) Pavel Emelyanov
     [not found] ` <47AAFFF2.9030804-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-02-07 12:57   ` [PATCH 1/4] Some changes in the kobject mapper Pavel Emelyanov
2008-02-07 12:59   ` [PATCH 2/4] The character devices layer changes Pavel Emelyanov
2008-02-07 12:59   ` [PATCH 3/4] The block " Pavel Emelyanov
2008-02-07 13:01   ` [PATCH 4/4] The control group itself Pavel Emelyanov
     [not found]     ` <47AB013B.8060502-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-02-11 17:38       ` Serge E. Hallyn [this message]
     [not found]         ` <20080211173830.GA22160-6s5zFf/epYL1ENwx4SLHqw@public.gmane.org>
2008-02-12 10:28           ` Pavel Emelyanov
     [not found]             ` <47B174B2.5010500-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-02-12 17:21               ` Serge E. Hallyn
     [not found]                 ` <20080212172134.GA12177-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2008-02-13  2:17                   ` Paul Menage
     [not found]                     ` <6599ad830802121817n7713fa85h51aedf4df74aa764-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-02-13  2:42                       ` Paul Jackson
     [not found]                         ` <20080212204215.2eca689f.pj-sJ/iWh9BUns@public.gmane.org>
2008-02-14 17:17                           ` Serge E. Hallyn
2008-02-13  2:32                   ` Paul Jackson
     [not found]                     ` <20080212203215.fb636900.pj-sJ/iWh9BUns@public.gmane.org>
2008-02-14 17:18                       ` Serge E. Hallyn
2008-02-12  7:42       ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
     [not found]         ` <20080212074217.GA15992-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-02-12  7:51           ` Pavel Emelyanov
2008-02-21 20:47       ` Paul Menage
     [not found]         ` <6599ad830802211247t21fdc4e4hfe637fcffd98ded7-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-02-22  8:12           ` Pavel Emelyanov
     [not found]             ` <47BE83FD.7060908-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-02-23 23:12               ` Paul Menage
     [not found]                 ` <6599ad830802231512t20343cabq738df3039c8a1d1f-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-02-26  7:54                   ` Pavel Emelyanov
2008-02-08 16:12   ` [PATCH 0/4] Devices accessibility control group (v3, release candidate) Serge E. Hallyn
  -- strict thread matches above, loose matches on Subject: below --
2008-01-08  9:02 [PATCH 0/4] Devices accessibility control group (v2) Pavel Emelyanov
     [not found] ` <47833C3A.8090106-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-01-08  9:18   ` [PATCH 4/4] The control group itself Pavel Emelyanov
     [not found]     ` <47833FF6.6060901-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-01-14 17:40       ` Serge E. Hallyn
     [not found]         ` <20080114174056.GB15077-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2008-01-15  7:53           ` Pavel Emelyanov
     [not found]             ` <478C6669.7070705-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2008-01-15 14:44               ` Serge E. Hallyn
     [not found]                 ` <20080115144440.GE4453-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2008-01-15 16:13                   ` Paul Menage
     [not found]                     ` <6599ad830801150813s6a5a7374qd25b6d6206d5896a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-01-15 17:49                       ` Serge E. Hallyn
     [not found]                         ` <20080115174941.GA11638-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2008-01-15 17:54                           ` Paul Menage
     [not found]                             ` <6599ad830801150954w7e1b6db0p4dd737730f407348-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-01-15 18:17                               ` Serge E. Hallyn
2008-01-14 21:54       ` Paul Menage
     [not found]         ` <6599ad830801141354p5b165cdao8d6184adb9ab61b6-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-01-15  7:58           ` Pavel Emelyanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080211173830.GA22160@sergelap.ibm.com \
    --to=serue-r/jw6+rmf7hqt0dzr+alfa@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox