public inbox for cgroups@vger.kernel.org
From: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>,
	"Stéphane Graber"
	<stgraber-/Ni+VN9Krahg9hUCZPvPmw@public.gmane.org>,
	"Tim Hockin" <thockin-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	"Victor Marmol" <vmarmol-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	"Rohit Jnagal" <jnagal-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	lxc-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org
Subject: Re: [lxc-devel] cgroup management daemon
Date: Tue, 3 Dec 2013 18:03:44 -0600
Message-ID: <20131204000344.GB24968@sergelap>
In-Reply-To: <20131203134506.GG8277-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>

Quoting Tejun Heo (tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org):
> Hello, guys.
> 
> Sorry about the delay.
> 
> On Mon, Nov 25, 2013 at 10:43:35PM +0000, Serge E. Hallyn wrote:
> > Additionally, Tejun has specified that we do not want users to be
> > too closely tied to the cgroupfs implementation.  Therefore
> > commands will be just a hair more general than specifying cgroupfs
> > filenames and values.  I may go so far as to avoid specifying
> > specific controllers, as AFAIK there should be no redundancy in
> > features.  On the other hand, I don't want to get too general.
> > So I'm basing the API loosely on the lmctfy command line API.
> 
> One of the reasons for not exposing knobs as-is is that the knobs we
> currently have aren't consistent.  The weight values have different
> ranges, some combinations of values don't make much sense, and so on.
> The user can cope with it but it'd probably be better to expose
> something which doesn't lead to mistakes too easily.

For the moment, for the prototype (github.com/hallyn/cgmanager), I'm
just going with filenames/values.

When the bulk of the work is done, we can (a) introduce a thin
abstraction layer over the key/values, (b) whitelist some of the
filenames and filter some values, or do both.
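
As a rough illustration of option (b), a whitelist plus value filter
could look something like the sketch below.  The key list and the
function names (key_is_allowed, value_is_decimal,
request_is_acceptable) are made up for this example and are not taken
from cgmanager; which keys and values are actually safe to accept is
still an open question.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical whitelist of cgroupfs filenames a client may write.
 * The entries are illustrative only, not a vetted list. */
static const char *const allowed_keys[] = {
    "cpu.shares",
    "memory.limit_in_bytes",
};

/* Return true if key appears on the whitelist. */
bool key_is_allowed(const char *key)
{
    assert(key != NULL);
    for (size_t i = 0; i < sizeof(allowed_keys) / sizeof(allowed_keys[0]); i++) {
        if (strcmp(key, allowed_keys[i]) == 0)
            return true;
    }
    return false;
}

/* Example value filter: accept only a non-empty string of decimal digits.
 * This is deliberately stricter than what the kernel itself would accept. */
bool value_is_decimal(const char *value)
{
    assert(value != NULL);
    if (*value == '\0')
        return false;
    for (const char *p = value; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return false;
    }
    return true;
}

/* A set-value request passes only if both the key and the value pass. */
bool request_is_acceptable(const char *key, const char *value)
{
    return key_is_allowed(key) && value_is_decimal(value);
}
```

A per-key filter table would be the natural next step once there is a
list of which values are problematic for which keys.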

I know the upstart folks don't want to have to wait long for a
specification...  I'll hopefully make a final decision on this next
week.

> > The above addresses
> >     * creating cgroups
> >     * chowning cgroups
> >     * setting cgroup limits
> >     * moving tasks into cgroups
> >   . but does not address a 'cgexec <group> -- command' type of behavior.
> >     * To handle that (specifically for upstart), recommend that r do:
> >       if (!pid) {
> >         request_reclassify(cgroup, getpid());
> >         do_execve();
> >       }
> >   . alternatively, the daemon could, if kernel is new enough, setns to
> >     the requestor's namespaces to execute a command in a new cgroup.
> >     The new command would be daemonized to that pid namespaces' pid 1.
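
The fork/reclassify/exec pattern quoted above can be sketched as a
small helper.  This is only an assumption-laden illustration:
request_reclassify() here is a stub standing in for whatever request
the client would actually send to the manager, and run_in_cgroup() is
a name invented for this example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the request to the manager.  A real client
 * would send (cgroup, pid) to the daemon and wait for its reply. */
static int request_reclassify(const char *cgroup, pid_t pid)
{
    (void)cgroup;
    (void)pid;
    return 0;
}

/* Fork, have the child ask to be moved into cgroup, then exec argv.
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_in_cgroup(const char *cgroup, char *const argv[])
{
    assert(cgroup != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: reclassify first so the command starts in the new cgroup. */
        if (request_reclassify(cgroup, getpid()) != 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127); /* only reached if execvp failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Doing the reclassify in the child before the exec means the command
never runs outside the target cgroup, at the cost of the child having
to talk to the manager itself.
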
> 
> So, IIUC, cgroup hierarchy management - creation and removal of
> cgroups and assignments of tasks will go through while configuring
> control knobs will be delegated to the cgroup owner, right?

Not sure what you mean, but I think the answer is no.  Everything
goes through the manager.  The manager doesn't try to enforce that,
but by default the cgroup filesystems will only be mounted in the
manager's private mnt_ns, and containers at least will not be
allowed to mount cgroup fstype.

> Hmmm... the plan is to allow delegating task assignments in the
> sub-hierarchy but require CAP_X for writes to knobs (not reads).  This
> stems from the fact that, especially with unified hierarchy, those
> operations will be cgroup-core proper operations which are gonna be
> relatively safer and that task organizations in the subhierarchy and
> monitoring knobs are likely to be higher frequency operation than
> enabling and configuring controllers.

Should be ok for this.

> As I communicated multiple times before, delegating write access to
> control knobs to untrusted domain has always been a security risk and
> is likely to continue to remain so.  Also, organizationally, a

Then that will need to be addressed with per-key blacklisting and/or
per-value filtering in the manager.

Which is my way of saying:  can we please have a list of the security
issues so we can handle them?  :)  (I've asked several times before
but haven't seen a list or anyone offering to make one)

> cgroup's control knobs belong to the parent not the cgroup itself.

After thinking awhile I think this makes perfect sense.  I haven't
implemented set_value yet, and when I do I think I'll implement this
guideline.
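
One possible reading of that guideline, as a sketch only: a requester
whose own cgroup is "owner" may set values on cgroups strictly below
it, but not on its own cgroup.  The function name may_set_value and
the path convention (absolute, normalized, no trailing slash) are
assumptions for this example, not what cgmanager does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if target is a strict descendant of owner.
 * Both paths are assumed to be normalized: absolute, without "." or
 * ".." components, and without a trailing slash (except for "/"). */
bool may_set_value(const char *owner, const char *target)
{
    assert(owner != NULL && target != NULL);

    /* The root owns everything except itself. */
    if (strcmp(owner, "/") == 0)
        return target[0] == '/' && target[1] != '\0';

    size_t n = strlen(owner);
    if (strncmp(owner, target, n) != 0)
        return false;

    /* Require a path separator right after the prefix, so that
     * "/lxc/c1" does not match "/lxc/c10", and require something
     * after it, so that a cgroup cannot set its own knobs. */
    return target[n] == '/' && target[n + 1] != '\0';
}
```

With this check, a container placed in /lxc/c1 could configure
/lxc/c1/web but not /lxc/c1 itself, whose knobs stay with the parent.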

> That probably is why you were thinking about putting an extra cgroup
> in between for isolation, but the root problem there is that those
> knobs belong to the parent, not the directory itself.

Yup.

> Security is in most part logistics - it's about getting all the
> details right, and we don't either design or implement each knob with
> security in mind and DoSing them has always been pretty easy, so I
> don't think delegating write accesses to knobs is a good idea.
> 
> If you, for whatever reason, can trust the delegatee, which I believe
> is the case for google, it's fine.  If you're trying to delegate to a
> container which you don't have any control over, it isn't a good idea.
> 
> Another thing to consider is due to both the fundamental characteristics
> of hierarchy and implementation issues, things will become expensive
> if nesting gets beyond several layers (if controllers are enabled,
> that is) and the controllers in general will be implemented and
> optimized with limited level of nesting in mind.  IOW, building, say,
> 8 level deep hierarchy in the host and then doing the same thing
> inside the container with controllers enabled won't make a very happy

Yes, I very much want to avoid that.

> system.  It probably is something to keep in mind when laying out how
> the whole thing eventually would look like.
> 
> > Long-term we will want the cgroup manager to become more intelligent -
> > to place its own limits on clients, to address cpu and device hotplug,
> > etc.  Since we will not be doing that in the first prototype, the daemon
> > will not keep any state about the clients.
> 
> Isn't the above conflicting with chowning control knobs?

Not sure what you mean by this.

To be clear, what I'm talking about is having the client be able to say
"grant 50% of cpus", then when more cpus are added, the actual cpuset
gets recalculated.  This may well forever stay outside of the cgmanager
scope.  It may be more appropriate to put that logic into the lmctfy
layer.
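
To make the recalculation concrete, here is a minimal sketch.  It
assumes CPUs are numbered contiguously from 0 (not guaranteed with
hotplug) and that the share is rounded down to a whole CPU with a
floor of one; cpus_for_share and format_cpuset are names invented for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of CPUs corresponding to a percentage of the online CPUs.
 * Rounds down, but never returns fewer than one CPU. */
unsigned int cpus_for_share(unsigned int online, unsigned int percent)
{
    assert(online > 0 && percent > 0 && percent <= 100);
    unsigned int n = online * percent / 100;
    return n > 0 ? n : 1;
}

/* Write a cpuset-style list covering the first n CPUs into buf,
 * e.g. "0" for one CPU or "0-3" for four.  Returns what snprintf returns. */
int format_cpuset(char *buf, size_t len, unsigned int n)
{
    assert(buf != NULL && n > 0);
    if (n == 1)
        return snprintf(buf, len, "0");
    return snprintf(buf, len, "0-%u", n - 1);
}
```

On a hotplug event the layer holding the "50%" request would call
cpus_for_share() again with the new online count and rewrite the
cpuset, which is the state the first prototype does not keep.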

thanks,
-serge
