Linux Container Development
  • * [RFD] cgroup: about multiple hierarchies
    @ 2012-02-21 21:19 Tejun Heo
      0 siblings, 0 replies; 85+ messages in thread
    From: Tejun Heo @ 2012-02-21 21:19 UTC
      To: Li Zefan, containers, cgroups
      Cc: Frederic Weisbecker, Andrew Morton, Kay Sievers,
    	linux-kernel, Lennart Poettering
    
    Hello, guys.
    
    I've been thinking about multiple hierarchy support in cgroup for a
    while, especially after Frederic's pending task counter patchset.
    This is a write up of what I've been thinking.  I don't know what to
    do yet and simply continuing the current situation definitely is an
    option, so please read on and throw in your 20 Won (or whatever amount
    in whatever currency you want).
    
    * The problems.
    
    The support for multiple process hierarchies always struck me as
    rather strange.  If you forget about the current cgroup controllers
    and their implementations, the *only* reason to support multiple
    hierarchies is if you want to apply resource limits based on different
    orthogonal categorizations.
    
    Documentation/cgroups.txt seems to be written with this consideration
    in mind.  It's giving an example of applying limits according to two
    orthogonal categorizations - user groups (professors, students...)
    and applications (WWW, NFS...).  While it may sound like a valid use
    case, I'm very skeptical how useful or common mixing such orthogonal
    categorizations in a single setup would be.
    
    If support for multiple hierarchies comes for free, at least in terms
    of features, maybe it can be better but of course it isn't so.  Any
    given cgroup subsystem (or controller) can only be applied to a single
    hierarchy, which makes sense for a lot of things - what would two
    different limits on the same resource from different hierarchies mean?
    But, there also are things which can be used and useful in all
    hierarchies - e.g. cgroup freezer and task counter.
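
    A toy model of the constraint above, not the kernel's implementation:
    each controller may be bound to at most one hierarchy, so a utility
    controller such as the freezer has to pick a side.  The names used
    here (ControllerRegistry, by_user, by_app) are made up for
    illustration.

```python
class ControllerRegistry:
    """Toy model: a controller can be attached to only one hierarchy."""

    def __init__(self):
        self._owner = {}  # controller name -> hierarchy it is bound to

    def attach(self, controller, hierarchy):
        owner = self._owner.get(controller)
        if owner is not None and owner != hierarchy:
            raise ValueError(
                f"{controller} is already bound to hierarchy {owner}"
            )
        self._owner[controller] = hierarchy

    def hierarchy_of(self, controller):
        return self._owner.get(controller)


registry = ControllerRegistry()
registry.attach("cpu", "by_user")      # limits per user group
registry.attach("memory", "by_app")    # limits per application
registry.attach("freezer", "by_user")  # freezer has to pick one side

try:
    registry.attach("freezer", "by_app")
    freezer_shared = True
except ValueError:
    freezer_shared = False  # the second hierarchy cannot use the freezer
```

    In this sketch whoever attaches the freezer first decides which
    categorization it follows, and everyone else is locked out.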
    
    While the current cgroup implementation and conventions can probably
    allow admins and engineers to tailor cgroup configuration for a
    specific setup, it is very difficult to use in a generic and automated
    way.  I mean, who owns the freezer or task counter?  If they're
    mounted on their own hierarchies, how should they be structured?
    Should the different hierarchies be structured such that they are
    projections of one unified hierarchy so that those generic mechanisms
    can be applied uniformly?  If so, why do we need multiple hierarchies
    at all?
    
    A related limitation is that as different subsystems don't know which
    hierarchies they'll end up on, they can't cooperate.  Wouldn't it make
    more sense if task counter is a separate thing watching the resources
    and triggering different actions as configured - be it failing forks or
    freezing?
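
    A rough sketch of that idea, purely hypothetical and unrelated to
    the pending patchset's actual interface: the counter only watches
    the number of tasks, and what happens at the limit is a pluggable
    action.  TaskCounter, on_limit and the group name are assumptions
    made for this example.

```python
from typing import Callable


class TaskCounter:
    """Counts tasks in a group and delegates the over-limit reaction."""

    def __init__(self, limit: int, on_limit: Callable[[str], None]):
        self.limit = limit
        self.on_limit = on_limit  # configured action, e.g. refuse or freeze
        self.count = 0

    def fork(self, group: str) -> bool:
        """Return True if the new task is admitted, False otherwise."""
        if self.count >= self.limit:
            self.on_limit(group)  # the counter itself stays policy-free
            return False
        self.count += 1
        return True


events = []


def freeze_group(group: str) -> None:
    # Stand-in for handing the group over to the freezer.
    events.append(f"freeze {group}")


counter = TaskCounter(limit=2, on_limit=freeze_group)
results = [counter.fork("build") for _ in range(3)]
```

    Swapping freeze_group for a function that merely logs the refused
    fork changes the policy without touching the counter.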
    
    And yet another oddity is how cgroup handles nested cgroups - some
    care about nesting but others just treat both internal and leaf nodes
    equally.  They don't care about the topology at all.  This, too, can
    be fine if you approach things subsys by subsys and use them in
    different ways but if you try to combine them in a generic way you get
    sucked into the lala land of whatevers.
    
    The following is a "best practices" document on using cgroups.
    
      http://www.freedesktop.org/wiki/Software/systemd/PaxControlGroups
    
    To me, it seems to demonstrate the rather ugly situation that the
    current cgroup is providing.  Everyone should tip-toe around cgroup
    hierarchies and nobody has full knowledge or control over them.
    e.g. base system management (e.g. systemd) can't use freezer or task
    counter as someone else might want to use it for different hierarchy
    layout.
    
    It seems to me that cgroup interface is too complicated and inflexible
    at the same time to be useful in a generic manner.  Sure, it can be
    useful for setups individually crafted by engineers and admins to
    match specific sites or applications but as soon as you try to do
    something automatic and generic with it, there just are too many
    different scenarios and limitations to consider.
    
    
    * So, what to do?
    
    Heh, I don't know.  IIRC, last year at LinuxCon Japan, I heard
    Christoph saying that the biggest problem w/ cgroup was that it was
    building completely separate hierarchies out of the traditional
    process hierarchies.  After thinking about this stuff for a while, I
    fully agree with him.  I think this whole thing should have been a
    layer over the process tree like sessions or process groups.
    
    Unfortunately, that ship sailed long ago and we gotta make do with
    what we have on our collective hands.  Here are some paths that we can
    take.
    
    1. We're screwed anyway.  Just don't worry about it and continue down
       on this path.  Can't get much worse, right?
    
       This approach has the apparent advantage of not having to do
       anything and is probably most likely to be taken.  This isn't ideal
       but hey nothing is. :P
    
    2. Make it more flexible (and likely more complex, unfortunately).
       Allow the utility type subsystems to be used in multiple
       hierarchies.  The easiest and probably dirtiest way to achieve that
       would be embedding them into cgroup core.
    
       Thinking about doing this depresses me and it's not like I have a
       cheerful personality to begin with. :(
    
    3. Head towards single hierarchy with the pie-in-the-sky goal of
       merging things into process hierarchy in some distant future.
    
       The first step would be herding people to use a unified hierarchy
       (ie. all subsystems mounted on a single cgroup tree) which is
       controlled by a single entity in userland (be it systemd or cgroupd,
       cgroup-kit or whatever); however, even if we exclude supporting
       orthogonal categorizations, there are a good number of non-trivial
       hurdles to clear before this can be realized.
    
       Most importantly, we would need to clean up how nesting is handled
       across different subsystems.  Handling internal and leaf nodes as
       equals simply can't work.  Membership should be recursive, and for
       subsystems which can't support proper nesting, the right thing to
       do would be somehow ensuring that only a single node in the path from
       root to leaf is active for the controller.  We may even have to
       introduce an alternative mode of operation to support this (yuck).
    
       This path would require the most work and we would be
       excluding a feature - support for multiple orthogonal
       categorizations - which has been available till now, probably
       through a deprecation process spanning years; however, this at least
       gives us hope that we may reach sanity in the end, however distant
       that end may be.  Oh, hope. :)
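
    The "only a single node in the path from root to leaf is active"
    rule from option 3 can be sketched as a tree check.  This is a
    minimal illustration under assumed names (Node, active,
    paths_with_multiple_active), not a proposal for how the kernel
    would enforce it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    active: bool = False  # is the controller enabled at this node?
    children: List["Node"] = field(default_factory=list)


def paths_with_multiple_active(node, active_above=0, prefix=""):
    """Return root-to-leaf paths holding more than one active node."""
    path = f"{prefix}/{node.name}"
    count = active_above + (1 if node.active else 0)
    if not node.children:
        return [path] if count > 1 else []
    bad = []
    for child in node.children:
        bad.extend(paths_with_multiple_active(child, count, path))
    return bad


tree = Node("root", children=[
    Node("system", active=True, children=[
        Node("db", active=True),  # nested under an active node
        Node("web"),
    ]),
    Node("user", children=[
        Node("alice", active=True),
    ]),
])

violations = paths_with_multiple_active(tree)
```

    Here only the path through "system" and "db" breaks the rule; a
    controller without proper nesting support would have to reject
    activating "db" while "system" is active.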
    
    So, I mean, I don't know.  What do other people think?  Is this an
    unnecessary worry?  Are people generally happy with the way things
    are?  Lennart, Kay, what do you guys think?
    
    Thanks.
    
    --
    tejun
    

    end of thread, other threads:[~2012-03-16 23:14 UTC | newest]
    
    Thread overview: 85+ messages
         [not found] <20120221211938.GE12236@google.com>
         [not found] ` <20120221211938.GE12236-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-02-21 21:21   ` [RFD] cgroup: about multiple hierarchies Tejun Heo
    2012-02-22 13:30   ` Peter Zijlstra
    2012-02-22 15:45   ` Frederic Weisbecker
         [not found]     ` <20120222154501.GA1693-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
    2012-02-22 18:22       ` Tejun Heo
         [not found]     ` <20120222182207.GC32694@google.com>
         [not found]       ` <20120222182207.GC32694-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-02-27 17:46         ` Frederic Weisbecker
    2012-02-22 16:38   ` Vivek Goyal
    2012-02-23  8:22   ` Li Zefan
    2012-03-03  9:58   ` Eric W. Biederman
         [not found]     ` <m162em2efy.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
    2012-03-03 14:26       ` Serge Hallyn
    2012-03-05 11:37   ` Lennart Poettering
    2012-03-12 22:10   ` Tejun Heo
         [not found] ` <1329917459.24994.14.camel@twins>
    2012-02-22 13:37   ` Glauber Costa
    2012-02-22 18:01   ` Tejun Heo
    2012-02-23  7:39   ` Li Zefan
         [not found] ` <20120222163858.GB4128@redhat.com>
         [not found]   ` <20120222163858.GB4128-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-22 16:57     ` Vivek Goyal
    2012-02-22 18:33     ` Tejun Heo
    2012-02-23  7:59     ` Li Zefan
         [not found]   ` <20120222165714.GC4128@redhat.com>
         [not found]     ` <20120222165714.GC4128-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-22 18:43       ` Tejun Heo
    2012-02-23  9:41       ` Peter Zijlstra
    2012-02-23 14:13         ` Peter Zijlstra
    2012-03-01 17:19           ` Michal Schmidt
         [not found]             ` <4F4FAF89.3090706-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-01 18:03               ` Peter Zijlstra
    2012-03-02 11:08                 ` Michal Schmidt
         [not found]                   ` <4F50AA22.9080007-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-02 11:23                     ` Peter Zijlstra
         [not found]                   ` <1330687394.11248.222.camel@twins>
    2012-03-02 11:28                     ` Michal Schmidt
         [not found]                       ` <4F50AEC3.5090807-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-02 11:34                         ` Peter Zijlstra
    2012-03-01 20:26               ` Mike Galbraith
         [not found]             ` <1330633603.7414.49.camel@marge.simpson.net>
         [not found]               ` <1330633603.7414.49.camel-YqMYhexLQo31wTEvPJ5Q0F6hYfS7NtTn@public.gmane.org>
    2012-03-01 21:02                 ` Vivek Goyal
    2012-03-02  2:43                 ` Kay Sievers
         [not found]                   ` <CAPXgP12_A=uz_p92eBN49DTSKj7iP0rChW9cE81aZKWEjOH5nA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    2012-03-02 10:15                     ` Peter Zijlstra
    2012-03-02 11:16                 ` Michal Schmidt
         [not found]                   ` <4F50ABF2.5070809-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-02 11:24                     ` Peter Zijlstra
         [not found]               ` <20120301210213.GF13533@redhat.com>
         [not found]                 ` <20120301210213.GF13533-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-01 22:04                   ` Mike Galbraith
         [not found]                 ` <1330639448.7414.97.camel@marge.simpson.net>
         [not found]                   ` <1330639448.7414.97.camel-YqMYhexLQo31wTEvPJ5Q0F6hYfS7NtTn@public.gmane.org>
    2012-03-01 22:38                     ` C Anthony Risinger
    2012-03-02 10:51                     ` Michal Schmidt
         [not found]                       ` <4F50A63F.40306-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-02 11:52                         ` Mike Galbraith
    2012-03-05 12:43                     ` Lennart Poettering
         [not found]                       ` <20120305124310.GD10929-kS5D54t9nk0aINubkmmoJbNAH6kLmebB@public.gmane.org>
    2012-03-05 15:47                         ` Mike Galbraith
         [not found]                       ` <1330962421.7368.69.camel@marge.simpson.net>
         [not found]                         ` <1330962421.7368.69.camel-YqMYhexLQo31wTEvPJ5Q0F6hYfS7NtTn@public.gmane.org>
    2012-03-05 19:58                           ` Mike Galbraith
    2012-02-23 21:38         ` Vivek Goyal
         [not found]           ` <20120223213847.GK19691-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-23 22:34             ` Tejun Heo
         [not found]               ` <20120223223457.GJ22536-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-02-28 21:16                 ` Vivek Goyal
         [not found]                   ` <20120228211627.GH9920-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-28 21:21                     ` Peter Zijlstra
    2012-02-28 21:35                       ` Vivek Goyal
         [not found]                         ` <20120228213526.GI9920-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-28 21:43                           ` Peter Zijlstra
    2012-02-28 21:54                             ` Vivek Goyal
         [not found]                               ` <20120228215439.GJ9920-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-28 22:00                                 ` Peter Zijlstra
    2012-02-28 22:31                                   ` Vivek Goyal
    2012-02-28 21:53                           ` Peter Zijlstra
    2012-02-28 22:09                             ` Vivek Goyal
    2012-02-24 11:33             ` Peter Zijlstra
         [not found]   ` <20120222183351.GD32694@google.com>
         [not found]     ` <20120222183351.GD32694-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-02-23 19:41       ` Vivek Goyal
         [not found]         ` <20120223194109.GI19691-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-02-23 22:38           ` Tejun Heo
         [not found]   ` <4F45F1F0.2010102@cn.fujitsu.com>
         [not found]     ` <4F45F1F0.2010102-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
    2012-02-23 20:32       ` Vivek Goyal
         [not found] ` <4F45F742.1060605@cn.fujitsu.com>
         [not found]   ` <4F45F742.1060605-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
    2012-02-23 17:33     ` Tejun Heo
         [not found] ` <20120221212106.GF12236@google.com>
         [not found]   ` <4F44EEE4.2000809@parallels.com>
         [not found]     ` <4F44EEE4.2000809-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
    2012-02-23  7:45       ` Serge E. Hallyn
         [not found]     ` <20120223074526.GA15835@mail.hallyn.com>
         [not found]       ` <20120223074526.GA15835-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
    2012-02-23 17:29         ` Tejun Heo
         [not found]           ` <20120223172915.GC22536-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-02-23 18:47             ` Serge Hallyn
         [not found]   ` <20120221212106.GF12236-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-02-22 13:34     ` Glauber Costa
    2012-02-26  4:59     ` Konstantin Khlebnikov
         [not found] ` <20120312221050.GG23255@google.com>
         [not found]   ` <1331590938.18960.57.camel@twins>
    2012-03-12 22:28     ` Tejun Heo
         [not found]       ` <20120312222817.GI23255-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-03-12 22:31         ` Lennart Poettering
    2012-03-12 22:32         ` Peter Zijlstra
    2012-03-12 22:39           ` Tejun Heo
         [not found]           ` <20120312223944.GJ23255@google.com>
         [not found]             ` <20120312223944.GJ23255-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-03-12 22:44               ` Peter Zijlstra
    2012-03-12 23:04                 ` Tejun Heo
         [not found]                   ` <20120312230416.GM23255-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-03-13 14:10                     ` Vivek Goyal
         [not found]                       ` <20120313141032.GD29169-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-13 16:11                         ` C Anthony Risinger
    2012-03-13 17:25                         ` Peter Zijlstra
    2012-03-13 17:31                           ` Peter Zijlstra
         [not found]                       ` <CAGAVQTGus7LUWV3AdhAFy--gr=uJRWtSGjuP69-EckBiXy0qVg@mail.gmail.com>
         [not found]                         ` <CAGAVQTGus7LUWV3AdhAFy--gr=uJRWtSGjuP69-EckBiXy0qVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    2012-03-13 16:30                           ` C Anthony Risinger
    2012-03-13 10:11                 ` Glauber Costa
    2012-03-13 14:03         ` Vivek Goyal
         [not found]           ` <20120313140345.GC29169-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-13 15:59             ` Tejun Heo
         [not found]           ` <20120313155955.GB7349@google.com>
         [not found]             ` <20120313155955.GB7349-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-03-16 23:14               ` James Bottomley
         [not found]       ` <20120312223113.GB18359@tango.0pointer.de>
         [not found]         ` <20120312223113.GB18359-kS5D54t9nk0aINubkmmoJbNAH6kLmebB@public.gmane.org>
    2012-03-12 23:00           ` Tejun Heo
         [not found]         ` <20120312230020.GL23255@google.com>
         [not found]           ` <20120312230020.GL23255-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-03-12 23:02             ` Peter Zijlstra
    2012-03-12 23:09               ` Tejun Heo
    2012-03-12 23:43               ` Lennart Poettering
         [not found]   ` <20120312223707.GA8272@peqn>
    2012-03-12 22:55     ` Tejun Heo
         [not found]   ` <20120312221050.GG23255-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    2012-03-12 22:22     ` Peter Zijlstra
    2012-03-12 22:37     ` Serge Hallyn
    2012-03-13 13:49     ` Vivek Goyal
         [not found]       ` <20120313134922.GB29169-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    2012-03-13 16:02         ` Tejun Heo
    2012-02-21 21:19 Tejun Heo
    
