* [PATCH] control groups: documentation improvements @ 2014-03-10 11:39 Glyn Normington 2014-03-10 14:07 ` Tejun Heo 0 siblings, 1 reply; 18+ messages in thread From: Glyn Normington @ 2014-03-10 11:39 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-kernel, gnormington From: Glyn Normington <gnormington@gopivotal.com> Various clarifications to make the control groups documentation easier to understand, especially for newcomers. Delete the phrase "set of parameters" which obfuscates the definition of cgroup. Crisp up the definition of subsystem. Explain the term "partition". Clarify that each hierarchy must be associated with at least one subsystem. Describe the representation of the cgroup virtual filesystem, since this is not specifically described later in the document. Clarify that subsystems may be attached to multiple hierarchies, although this isn't very useful, and explain what happens. Use the term "task" in preference to "process" everywhere, for consistency. The following formal model of control groups helped in producing the above clarifications: https://github.com/Zteve/container-specs/raw/master/cgroups/cgroups.pdf Related LKML thread: http://lkml.iu.edu//hypermail/linux/kernel/1402.0/02419.html Signed-off-by: Glyn Normington <gnormington@gopivotal.com> --- Kernel version: Linux 3.14-rc5. diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt index 821de56..003330a 100644 --- a/Documentation/cgroups/cgroups.txt +++ b/Documentation/cgroups/cgroups.txt @@ -43,24 +43,29 @@ specialized behaviour. Definitions: -A *cgroup* associates a set of tasks with a set of parameters for one -or more subsystems. +A *cgroup* associates a set of tasks with one or more subsystems. -A *subsystem* is a module that makes use of the task grouping -facilities provided by cgroups to treat groups of tasks in -particular ways. 
A subsystem is typically a "resource controller" that +A *subsystem* is a module that treats the tasks of each cgroup in a +particular way. A subsystem is typically a "resource controller" that schedules a resource or applies per-cgroup limits, but it may be -anything that wants to act on a group of processes, e.g. a -virtualization subsystem. +anything that wants to act on a group of tasks, e.g. a virtualization +subsystem. -A *hierarchy* is a set of cgroups arranged in a tree, such that -every task in the system is in exactly one of the cgroups in the -hierarchy, and a set of subsystems; each subsystem has system-specific -state attached to each cgroup in the hierarchy. Each hierarchy has -an instance of the cgroup virtual filesystem associated with it. +A *hierarchy* is a non-empty set of cgroups arranged in a tree and a +non-empty set of subsystems such that the cgroups in the hierarchy +partition all the tasks in the system (in other words, every task in the +system is in exactly one of the cgroups in the hierarchy) and each +subsystem attaches its own state to each cgroup in the hierarchy. -At any one time there may be multiple active hierarchies of task -cgroups. Each hierarchy is a partition of all tasks in the system. +There may be zero or more active hierarchies. Each hierarchy has an +instance of the cgroup virtual filesystem associated with it. The tree +of cgroups is represented by the directory tree in the cgroup virtual +filesystem. + +The sets of subsystems participating in distinct hierarchies are either +identical or disjoint. If the sets are identical, the virtual filesystems +associated with the hierarchies have identical content and a change in +one is automatically reflected in all the others. User-level code may create and destroy cgroups by name in an instance of the cgroup virtual file system, specify and query to @@ -69,9 +74,9 @@ a cgroup. 
Those creations and assignments only affect the hierarchy associated with that instance of the cgroup file system. On their own, the only use for cgroups is for simple job -tracking. The intention is that other subsystems hook into the generic +tracking. The intention is that subsystems hook into the generic cgroup support to provide new attributes for cgroups, such as -accounting/limiting the resources which processes in a cgroup can +accounting/limiting the resources which tasks in a cgroup can access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allow you to associate a set of CPUs and a set of memory nodes with the tasks in each cgroup. @@ -79,12 +84,12 @@ tasks in each cgroup. 1.2 Why are cgroups needed ? ---------------------------- -There are multiple efforts to provide process aggregations in the +There are multiple efforts to provide task aggregations in the Linux kernel, mainly for resource-tracking purposes. Such efforts include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server namespaces. These all require the basic notion of a -grouping/partitioning of processes, with newly forked processes ending -up in the same group (cgroup) as their parent process. +grouping/partitioning of tasks, with newly forked tasks ending +up in the same group (cgroup) as their parent task. The kernel cgroup patch provides the minimum essential kernel mechanisms required to efficiently implement such groups. It has @@ -418,11 +423,11 @@ To remove a cgroup, just use rmdir: # rmdir my_sub_cs This will fail if the cgroup is in use (has cgroups inside, or -has processes attached, or is held alive by other subsystem-specific +has tasks attached, or is held alive by other subsystem-specific reference). -2.2 Attaching processes ------------------------ +2.2 Attaching tasks +------------------- # /bin/echo PID > tasks @@ -450,7 +455,7 @@ move it into a new cgroup (possibly the root cgroup) by writing to the new cgroup's tasks file. 
Note: Due to some restrictions enforced by some cgroup subsystems, moving -a process to another cgroup can fail. +a task to another cgroup can fail. 2.3 Mounting hierarchies by name -------------------------------- @@ -658,7 +663,7 @@ A: bash's builtin 'echo' command does not check calls to write() against errors. If you use it in the cgroup file system, you won't be able to tell whether a command succeeded or failed. -Q: When I attach processes, only the first of the line gets really attached ! +Q: When I attach tasks, only the first of the line gets really attached ! A: We can only return one error code per call to write(). So you should also put only ONE PID. ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] control groups: documentation improvements
From: Tejun Heo @ 2014-03-10 14:07 UTC
To: Glyn Normington; +Cc: linux-kernel, Li Zefan

Hello, Glyn.

On Mon, Mar 10, 2014 at 11:39:28AM +0000, Glyn Normington wrote:
> Clarify that each hierarchy must be associated with at least one
> subsystem.

Hmmm... but named hierarchies can exist without any controllers
attached to them.

> Clarify that subsystems may be attached to multiple hierarchies,
> although this isn't very useful, and explain what happens.

And a subsystem may only be attached to a single hierarchy.

Thanks.

--
tejun
* Re: [PATCH] control groups: documentation improvements
From: Glyn Normington @ 2014-03-10 14:17 UTC
To: Tejun Heo; +Cc: linux-kernel, Li Zefan

Hi Tejun

Thanks for your quick reply. Responses inline.

Regards,
Glyn

On 10/03/2014 14:07, Tejun Heo wrote:
> Hello, Glyn.
>
> On Mon, Mar 10, 2014 at 11:39:28AM +0000, Glyn Normington wrote:
>> Clarify that each hierarchy must be associated with at least one
>> subsystem.
> Hmmm... but named hierarchies can exist without any controllers
> attached to them.

Then we missed how to create a hierarchy with no associated subsystems.
The only way I can think of is to use mount, specify no subsystems on -o
(which defaults to all the subsystems defined in the kernel), and run it
in a kernel with no subsystems defined (which seems unlikely these days).

Is that what you had in mind or is there some other way of creating a
hierarchy with no subsystems attached?

>
>> Clarify that subsystems may be attached to multiple hierarchies,
>> although this isn't very useful, and explain what happens.
> And a subsystem may only be attached to a single hierarchy.

Perhaps that's what should happen, but the following experiment
demonstrates a subsystem being attached to two hierarchies:

$ pwd
/home/vagrant
$ mkdir mem1
$ mkdir mem2
$ sudo su
# mount -t cgroup -o memory none /home/vagrant/mem1
# mount -t cgroup -o memory none /home/vagrant/mem2
# cd mem1
# mkdir inst1
# ls inst1
cgroup.clone_children memory.failcnt ...
# ls ../mem2
cgroup.clone_children inst1 memory.limit_in_bytes ...
# cd inst1
# echo 1000000 > memory.limit_in_bytes
# cat memory.limit_in_bytes
1003520
# cat ../../mem2/inst1/memory.limit_in_bytes
1003520
# echo $$ > tasks
# cat tasks
1365
1409
# cat ../../mem2/inst1/tasks
1365
1411

>
> Thanks.
>
* Re: [PATCH] control groups: documentation improvements
From: Tejun Heo @ 2014-03-10 14:20 UTC
To: Glyn Normington; +Cc: linux-kernel, Li Zefan

Hey,

On Mon, Mar 10, 2014 at 02:17:21PM +0000, Glyn Normington wrote:
> Then we missed how to create a hierarchy with no associated
> subsystems. The only way I can think of is to use mount, specify no
> subsystems on -o (which defaults to all the subsystems defined in
> the kernel), and run it in a kernel with no subsystems defined
> (which seems unlikely these days).
>
> Is that what you had in mind or is there some other way of creating
> a hierarchy with no subsystems attached?

Hierarchy name should be specified "-o name=" for hierarchies w/o any
controllers.

> >>Clarify that subsystems may be attached to multiple hierarchies,
> >>although this isn't very useful, and explain what happens.
> >And a subsystem may only be attached to a single hierarchy.
>
> Perhaps that's what should happen, but the following experiment
> demonstrates a subsystem being attached to two hierarchies:
>
> $ pwd
> /home/vagrant
> $ mkdir mem1
> $ mkdir mem2
> $ sudo su
> # mount -t cgroup -o memory none /home/vagrant/mem1
> # mount -t cgroup -o memory none /home/vagrant/mem2
> # cd mem1
> # mkdir inst1
> # ls inst1
> cgroup.clone_children memory.failcnt ...
> # ls ../mem2
> cgroup.clone_children inst1 memory.limit_in_bytes ...
> # cd inst1
> # echo 1000000 > memory.limit_in_bytes
> # cat memory.limit_in_bytes
> 1003520
> # cat ../../mem2/inst1/memory.limit_in_bytes
> 1003520
> # echo $$ > tasks
> # cat tasks
> 1365
> 1409
> # cat ../../mem2/inst1/tasks
> 1365
> 1411

You're mounting the same hierarchy twice. Those are two views into
the same hierarchy.

Thanks.

--
tejun
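One way to check the "same hierarchy, two views" explanation on a live
system is to look at /proc/cgroups, which lists a hierarchy ID next to
each subsystem. The following is a minimal sketch and not part of the
original exchange: hierarchy_of is a hypothetical helper name, and the
optional file argument is an assumption added so the function can be
tried on sample data instead of the real /proc/cgroups.

```shell
# hierarchy_of SUBSYSTEM [FILE]
# Print the hierarchy ID that SUBSYSTEM is attached to, reading FILE in
# /proc/cgroups format (columns: subsys_name hierarchy num_cgroups enabled).
# hierarchy_of is a hypothetical helper, not an existing tool.
hierarchy_of() {
    subsys=$1
    file=${2:-/proc/cgroups}
    # The header line starts with "#subsys_name", so it never matches.
    awk -v s="$subsys" '$1 == s { print $2 }' "$file"
}
```

If "hierarchy_of memory" prints a single ID while memory is mounted at
both mem1 and mem2, that is consistent with the two mount points being
views of one hierarchy.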
* Re: [PATCH] control groups: documentation improvements
From: Glyn Normington @ 2014-03-13 16:04 UTC
To: Tejun Heo; +Cc: linux-kernel, Li Zefan, Steve Powell

Hi Tejun

Stepping back from the patch for a while, we'd like to explore the
issues you raise. Please bear with us as we try to capture the ideas
precisely.

Continued inline...

Regards,
Glyn (& Steve Powell, copied)

On 10/03/2014 14:20, Tejun Heo wrote:
> Hey,
>
> On Mon, Mar 10, 2014 at 02:17:21PM +0000, Glyn Normington wrote:
>> Then we missed how to create a hierarchy with no associated
>> subsystems. The only way I can think of is to use mount, specify no
>> subsystems on -o (which defaults to all the subsystems defined in
>> the kernel), and run it in a kernel with no subsystems defined
>> (which seems unlikely these days).
>>
>> Is that what you had in mind or is there some other way of creating
>> a hierarchy with no subsystems attached?
> Hierarchy name should be specified "-o name=" for hierarchies w/o any
> controllers.

According to cgroups.txt:

    When passing a name=<x> option for a new hierarchy, you need to
    specify subsystems manually; the legacy behaviour of mounting all
    subsystems when none are explicitly specified is not supported when
    you give a subsystem a name.

So the documentation certainly does not make it clear that it is valid
to specify no subsystems.

We tried this (on a 3.11 kernel) and can't get it to work:

    # mount -t cgroup -o name=th none /home/glyn/h1

    mount: wrong fs type, bad option, bad superblock on none,
    missing codepage or helper program, or other error
    (for several filesystems (e.g. nfs, cifs) you might
    need a /sbin/mount.<type> helper program) In some
    cases useful info is found in syslog - try dmesg |
    tail or so

Please could you supply an example which works?

>>>> Clarify that subsystems may be attached to multiple hierarchies,
>>>> although this isn't very useful, and explain what happens.
>>> And a subsystem may only be attached to a single hierarchy.
>> Perhaps that's what should happen, but the following experiment
>> demonstrates a subsystem being attached to two hierarchies:
>>
>> $ pwd
>> /home/vagrant
>> $ mkdir mem1
>> $ mkdir mem2
>> $ sudo su
>> # mount -t cgroup -o memory none /home/vagrant/mem1
>> # mount -t cgroup -o memory none /home/vagrant/mem2
>> # cd mem1
>> # mkdir inst1
>> # ls inst1
>> cgroup.clone_children memory.failcnt ...
>> # ls ../mem2
>> cgroup.clone_children inst1 memory.limit_in_bytes ...
>> # cd inst1
>> # echo 1000000 > memory.limit_in_bytes
>> # cat memory.limit_in_bytes
>> 1003520
>> # cat ../../mem2/inst1/memory.limit_in_bytes
>> 1003520
>> # echo $$ > tasks
>> # cat tasks
>> 1365
>> 1409
>> # cat ../../mem2/inst1/tasks
>> 1365
>> 1411
> You're mounting the same hierarchy twice. Those are two views into
> the same hierarchy.
>
> Thanks.
>

Yes, it does appear that this is what is going on, but to explain it
this way turns out to be more complicated than one might expect. Here's
an attempt...

---
A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
non-empty set of subsystems such that the cgroups in the hierarchy
partition all the tasks in the system (in other words, every task in the
system is in exactly one of the cgroups in the hierarchy) and each
subsystem attaches its own state to each cgroup in the hierarchy.

There may be zero or more active hierarchies. Each hierarchy has one
or more views associated with it. A *view* is an instance of the cgroup
virtual filesystem. The tree of cgroups of a hierarchy is represented
by the directory tree in the cgroup virtual filesystem of each view of
the hierarchy. All the directory trees in the cgroup virtual filesystem
of the views of a given hierarchy have identical content.

No subsystem may participate in more than one hierarchy.
---

We'd also need to explain the behaviour of mount and umount with respect
to views. The first time a cgroup mount is performed with a given set of
subsystems, a hierarchy is created and a view of the hierarchy is
created and associated with a cgroup filesystem at the mount point.
Subsequently, if another cgroup mount is performed with the same set of
subsystems, no new hierarchy is created but a new view of the existing
hierarchy is created and associated with the cgroup filesystem at the
new mount point.

Unmounting a cgroup mount point destroys a particular view and destroys
the hierarchy associated with the view if and only if the view is the
only (remaining) view of the hierarchy.

So does introducing the concept of a view really help? The wording in
the patch does without the concept by allowing two "hierarchies" to have
identical content (so they are indistinguishable).

"views" don't seem to have any practical benefit. If we can avoid
explaining concepts which have no use, we probably should. ;-)
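The mount behaviour described above can be observed by listing every
mount point that exposes a given subsystem. The following is a minimal
sketch and not part of the original message: cgroup_views is a
hypothetical helper name, the optional file argument exists only so the
function can be run on sample data, and it assumes that the subsystem
names appear in the options field of /proc/mounts for cgroup mounts.

```shell
# cgroup_views SUBSYSTEM [FILE]
# Print the mount point of every cgroup mount whose options include
# SUBSYSTEM, reading FILE in /proc/mounts format
# (device mountpoint fstype options dump pass).
# cgroup_views is a hypothetical helper, not an existing tool.
cgroup_views() {
    subsys=$1
    file=${2:-/proc/mounts}
    # Wrap both strings in commas so "cpu" does not match "cpuacct".
    awk -v s="$subsys" \
        '$3 == "cgroup" && index("," $4 ",", "," s ",") { print $2 }' "$file"
}
```

Since a subsystem may only be attached to a single hierarchy, more than
one line of output for the same subsystem would indicate several views
of that one hierarchy rather than several hierarchies.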
* Re: [PATCH] control groups: documentation improvements
From: Li Zefan @ 2014-03-14 1:33 UTC
To: Glyn Normington; +Cc: Tejun Heo, linux-kernel, Steve Powell

On 2014/3/14 0:04, Glyn Normington wrote:
> Hi Tejun
>
> Stepping back from the patch for a while, we'd like to explore the issues you raise. Please bear with us as we try to capture the ideas precisely.
>
> Continued inline...
>
> Regards,
> Glyn (& Steve Powell, copied)
>
> On 10/03/2014 14:20, Tejun Heo wrote:
>> Hey,
>>
>> On Mon, Mar 10, 2014 at 02:17:21PM +0000, Glyn Normington wrote:
>>> Then we missed how to create a hierarchy with no associated
>>> subsystems. The only way I can think of is to use mount, specify no
>>> subsystems on -o (which defaults to all the subsystems defined in
>>> the kernel), and run it in a kernel with no subsystems defined
>>> (which seems unlikely these days).
>>>
>>> Is that what you had in mind or is there some other way of creating
>>> a hierarchy with no subsystems attached?
>> Hierarchy name should be specified "-o name=" for hierarchies w/o any
>> controllers.
>
> According to cgroups.txt:
>
> When passing a name=<x> option for a new hierarchy, you need to
> specify subsystems manually; the legacy behaviour of mounting all
> subsystems when none are explicitly specified is not supported when
> you give a subsystem a name.
>
> So the documentation certainly does not make it clear that it is valid to specify no subsystems.
>
> We tried this (on a 3.11 kernel) and can't get it to work:
>
> # mount -t cgroup -o name=th none /home/glyn/h1

# mount -t cgroup -o name=th,none none /home/glyn/h1

and then

# mount -t cgroup -o name=th none /home/glyn/h2

>
> mount: wrong fs type, bad option, bad superblock on none,
> missing codepage or helper program, or other error
> (for several filesystems (e.g. nfs, cifs) you might
> need a /sbin/mount.<type> helper program) In some
> cases useful info is found in syslog - try dmesg |
> tail or so
>
> Please could you supply an example which works?
>
>>>>> Clarify that subsystems may be attached to multiple hierarchies,
>>>>> although this isn't very useful, and explain what happens.
>>>> And a subsystem may only be attached to a single hierarchy.
>>> Perhaps that's what should happen, but the following experiment
>>> demonstrates a subsystem being attached to two hierarchies:
>>>
>>> $ pwd
>>> /home/vagrant
>>> $ mkdir mem1
>>> $ mkdir mem2
>>> $ sudo su
>>> # mount -t cgroup -o memory none /home/vagrant/mem1
>>> # mount -t cgroup -o memory none /home/vagrant/mem2
>>> # cd mem1
>>> # mkdir inst1
>>> # ls inst1
>>> cgroup.clone_children memory.failcnt ...
>>> # ls ../mem2
>>> cgroup.clone_children inst1 memory.limit_in_bytes ...
>>> # cd inst1
>>> # echo 1000000 > memory.limit_in_bytes
>>> # cat memory.limit_in_bytes
>>> 1003520
>>> # cat ../../mem2/inst1/memory.limit_in_bytes
>>> 1003520
>>> # echo $$ > tasks
>>> # cat tasks
>>> 1365
>>> 1409
>>> # cat ../../mem2/inst1/tasks
>>> 1365
>>> 1411
>> You're mounting the same hierarchy twice. Those are two views into
>> the same hierarchy.
>>
>> Thanks.
>>
> Yes, it does appear that this is what is going on, but to explain it this way turns out to be more complicated than one might expect. Here's an attempt...
>

Yeah, this can only confuse people.

I don't think we need extra explanation, because we all know the same
filesystem can have more than one mount point, and cgroupfs is no
different.

> ---
> A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
> non-empty set of subsystems such that the cgroups in the hierarchy
> partition all the tasks in the system (in other words, every task in the
> system is in exactly one of the cgroups in the hierarchy) and each
> subsystem attaches its own state to each cgroup in the hierarchy.
>
> There may be zero or more active hierarchies. Each hierarchy has one
> or more views associated with it. A *view* is an instance of the cgroup
> virtual filesystem. The tree of cgroups of a hierarchy is represented
> by the directory tree in the cgroup virtual filesystem of each view of
> the hierarchy. All the directory trees in the cgroup virtual filesystem
> of the views of a given hierarchy have identical content.
>
> No subsystem may participate in more than one hierarchy.
> ---
>
> We'd also need to explain the behaviour of mount and umount with respect to views. The first time a cgroup mount is performed with a given set of subsystems, a hierarchy is created and a view of the hierarchy is created and associated with a cgroup filesystem at the mount point. Subsequently, if another cgroup mount is performed with the same set of subsystems, no new hierarchy is created but a new view of the existing hierarchy is created and associated with the cgroup filesystem at the new mount point.
>
> Unmounting a cgroup mount point destroys a particular view and destroys the hierarchy associated with the view if and only if the view is the only (remaining) view of the hierarchy.
>
> So does introducing the concept of a view really help? The wording in the patch does without the concept by allowing two "hierarchies" to have identical content (so they are indistinguishable).
>
> "views" don't seem to have any practical benefit. If we can avoid explaining concepts which have no use, we probably should. ;-)
>
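The working example above (name=th,none) can be wrapped in a small
helper that builds the -o option string for a named hierarchy. The
following is a minimal sketch and not part of the original message:
cgroup_mount_opts is a hypothetical helper name, and it only prints a
string; it does not run mount itself.

```shell
# cgroup_mount_opts NAME [SUBSYSTEM...]
# Print a mount -o option string for a named cgroup hierarchy.
# With no subsystems given, the dummy subsystem name "none" is used,
# following the name=th,none example in this thread.
# cgroup_mount_opts is a hypothetical helper, not an existing tool.
cgroup_mount_opts() {
    name=$1
    shift
    if [ "$#" -eq 0 ]; then
        printf 'name=%s,none\n' "$name"
    else
        opts="name=$name"
        for s in "$@"; do
            opts="$opts,$s"
        done
        printf '%s\n' "$opts"
    fi
}
```

As root, it could be used along the lines of
mount -t cgroup -o "$(cgroup_mount_opts th)" none /home/glyn/h1,
where the mount point is the one from the example above.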
* Re: [PATCH] control groups: documentation improvements
From: Glyn Normington @ 2014-03-14 13:30 UTC
To: Li Zefan; +Cc: Tejun Heo, linux-kernel, Steve Powell

Hi Li

Thanks! Inline...

Tejun: Li agrees with our reasoning on the "views" issue, but we'd
like your feedback too in preparation for a new patch.

Regards,
Glyn & Steve

On 14/03/2014 01:33, Li Zefan wrote:
> On 2014/3/14 0:04, Glyn Normington wrote:
>> Hi Tejun
>>
>> Stepping back from the patch for a while, we'd like to explore the issues you raise. Please bear with us as we try to capture the ideas precisely.
>>
>> Continued inline...
>>
>> Regards,
>> Glyn (& Steve Powell, copied)
>>
>> On 10/03/2014 14:20, Tejun Heo wrote:
>>> Hey,
>>>
>>> On Mon, Mar 10, 2014 at 02:17:21PM +0000, Glyn Normington wrote:
>>>> Then we missed how to create a hierarchy with no associated
>>>> subsystems. The only way I can think of is to use mount, specify no
>>>> subsystems on -o (which defaults to all the subsystems defined in
>>>> the kernel), and run it in a kernel with no subsystems defined
>>>> (which seems unlikely these days).
>>>>
>>>> Is that what you had in mind or is there some other way of creating
>>>> a hierarchy with no subsystems attached?
>>> Hierarchy name should be specified "-o name=" for hierarchies w/o any
>>> controllers.
>> According to cgroups.txt:
>>
>> When passing a name=<x> option for a new hierarchy, you need to
>> specify subsystems manually; the legacy behaviour of mounting all
>> subsystems when none are explicitly specified is not supported when
>> you give a subsystem a name.
>>
>> So the documentation certainly does not make it clear that it is valid to specify no subsystems.
>>
>> We tried this (on a 3.11 kernel) and can't get it to work:
>>
>> # mount -t cgroup -o name=th none /home/glyn/h1
> # mount -t cgroup -o name=th,none none /home/glyn/h1

Great! We'll make sure to document this when we submit the next version
of the patch.

>
> and then
>
> # mount -t cgroup -o name=th none /home/glyn/h2
>
>> mount: wrong fs type, bad option, bad superblock on none,
>> missing codepage or helper program, or other error
>> (for several filesystems (e.g. nfs, cifs) you might
>> need a /sbin/mount.<type> helper program) In some
>> cases useful info is found in syslog - try dmesg |
>> tail or so
>>
>> Please could you supply an example which works?
>>
>>>>>> Clarify that subsystems may be attached to multiple hierarchies,
>>>>>> although this isn't very useful, and explain what happens.
>>>>> And a subsystem may only be attached to a single hierarchy.
>>>> Perhaps that's what should happen, but the following experiment
>>>> demonstrates a subsystem being attached to two hierarchies:
>>>>
>>>> $ pwd
>>>> /home/vagrant
>>>> $ mkdir mem1
>>>> $ mkdir mem2
>>>> $ sudo su
>>>> # mount -t cgroup -o memory none /home/vagrant/mem1
>>>> # mount -t cgroup -o memory none /home/vagrant/mem2
>>>> # cd mem1
>>>> # mkdir inst1
>>>> # ls inst1
>>>> cgroup.clone_children memory.failcnt ...
>>>> # ls ../mem2
>>>> cgroup.clone_children inst1 memory.limit_in_bytes ...
>>>> # cd inst1
>>>> # echo 1000000 > memory.limit_in_bytes
>>>> # cat memory.limit_in_bytes
>>>> 1003520
>>>> # cat ../../mem2/inst1/memory.limit_in_bytes
>>>> 1003520
>>>> # echo $$ > tasks
>>>> # cat tasks
>>>> 1365
>>>> 1409
>>>> # cat ../../mem2/inst1/tasks
>>>> 1365
>>>> 1411
>>> You're mounting the same hierarchy twice. Those are two views into
>>> the same hierarchy.
>>>
>>> Thanks.
>>>
>> Yes, it does appear that this is what is going on, but to explain it this way turns out to be more complicated than one might expect. Here's an attempt...
>>
> Yeah, this can only confuse people.
>
> I don't think we need extra explanation, because we all know the same
> filesystem can have more than one mount point, and cgroupfs is no
> different.

Thanks.

>
>> ---
>> A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
>> non-empty set of subsystems such that the cgroups in the hierarchy
>> partition all the tasks in the system (in other words, every task in the
>> system is in exactly one of the cgroups in the hierarchy) and each
>> subsystem attaches its own state to each cgroup in the hierarchy.
>>
>> There may be zero or more active hierarchies. Each hierarchy has one
>> or more views associated with it. A *view* is an instance of the cgroup
>> virtual filesystem. The tree of cgroups of a hierarchy is represented
>> by the directory tree in the cgroup virtual filesystem of each view of
>> the hierarchy. All the directory trees in the cgroup virtual filesystem
>> of the views of a given hierarchy have identical content.
>>
>> No subsystem may participate in more than one hierarchy.
>> ---
>>
>> We'd also need to explain the behaviour of mount and umount with respect to views. The first time a cgroup mount is performed with a given set of subsystems, a hierarchy is created and a view of the hierarchy is created and associated with a cgroup filesystem at the mount point. Subsequently, if another cgroup mount is performed with the same set of subsystems, no new hierarchy is created but a new view of the existing hierarchy is created and associated with the cgroup filesystem at the new mount point.
>>
>> Unmounting a cgroup mount point destroys a particular view and destroys the hierarchy associated with the view if and only if the view is the only (remaining) view of the hierarchy.
>>
>> So does introducing the concept of a view really help? The wording in the patch does without the concept by allowing two "hierarchies" to have identical content (so they are indistinguishable).
>>
>> "views" don't seem to have any practical benefit. If we can avoid explaining concepts which have no use, we probably should. ;-)
>>
* Re: [PATCH] control groups: documentation improvements
From: Tejun Heo @ 2014-03-14 14:01 UTC
To: Glyn Normington; +Cc: Li Zefan, linux-kernel, Steve Powell

Hey,

On Fri, Mar 14, 2014 at 01:30:29PM +0000, Glyn Normington wrote:
> Hi Li
>
> Thanks! Inline...
>
> Tejun: Li agrees with our reasoning on the "views" issue, but we'd
> like your feedback too in preparation for a new patch.

I don't really mind as long as it's not wrong. Can't be worse than
now anyway.

Thanks.

--
tejun
* Re: [PATCH] control groups: documentation improvements
From: Glyn Normington @ 2014-03-14 14:04 UTC
To: Tejun Heo; +Cc: Li Zefan, linux-kernel, Steve Powell

On 14/03/2014 14:01, Tejun Heo wrote:
> Hey,
>
> On Fri, Mar 14, 2014 at 01:30:29PM +0000, Glyn Normington wrote:
>> Hi Li
>>
>> Thanks! Inline...
>>
>> Tejun: Li agrees with our reasoning on the "views" issue, but we'd
>> like your feedback too in preparation for a new patch.
> I don't really mind as long as it's not wrong. Can't be worse than
> now anyway.

:-)

>
> Thanks.
>
* [PATCH v2] control groups: documentation improvements
  2014-03-14 14:04 ` Glyn Normington
@ 2014-04-02 12:43 ` Glyn Normington
  2014-04-02 13:17   ` [PATCH v3] " Glyn Normington
  0 siblings, 1 reply; 18+ messages in thread
From: Glyn Normington @ 2014-04-02 12:43 UTC
To: Tejun Heo; +Cc: Li Zefan, linux-kernel, Steve Powell

From: Glyn Normington<gnormington@gopivotal.com>

Various clarifications to make the control groups documentation
easier to understand, especially for newcomers.

Delete the phrase "set of parameters" which obfuscates the definition
of cgroup.

Crisp up the definition of subsystem.

Explain the term "partition".

Describe the representation of the cgroup virtual filesystem, since
this is not specifically described later in the document.

Clarify that subsystems may be attached to multiple hierarchies,
although this isn't very useful, and explain what happens.

Document how to create a hierarchy with no associated subsystems.

Use the term "task" in preference to "process" everywhere, for
consistency.

Related LKML thread:
http://lkml.iu.edu//hypermail/linux/kernel/1402.0/02419.html

Signed-off-by: Glyn Normington<gnormington@gopivotal.com>
---
Kernel version: Linux 3.14-rc5.

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 821de56..f086b70 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -24,6 +24,7 @@ CONTENTS:
   2.1 Basic Usage
   2.2 Attaching processes
   2.3 Mounting hierarchies by name
+  2.4 Mounting hierarchies with no subsystems
 3. Kernel API
   3.1 Overview
   3.2 Synchronization
@@ -43,24 +44,29 @@ specialized behaviour.
 
 Definitions:
 
-A *cgroup* associates a set of tasks with a set of parameters for one
-or more subsystems.
+A *cgroup* associates a set of tasks with zero or more subsystems.
 
-A *subsystem* is a module that makes use of the task grouping
-facilities provided by cgroups to treat groups of tasks in
-particular ways. A subsystem is typically a "resource controller" that
+A *subsystem* is a module that treats the tasks of each cgroup in a
+particular way. A subsystem is typically a "resource controller" that
 schedules a resource or applies per-cgroup limits, but it may be
-anything that wants to act on a group of processes, e.g. a
-virtualization subsystem.
+anything that wants to act on a group of tasks, e.g. a virtualization
+subsystem.
 
-A *hierarchy* is a set of cgroups arranged in a tree, such that
-every task in the system is in exactly one of the cgroups in the
-hierarchy, and a set of subsystems; each subsystem has system-specific
-state attached to each cgroup in the hierarchy. Each hierarchy has
-an instance of the cgroup virtual filesystem associated with it.
+A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
+set of subsystems such that the cgroups in the hierarchy
+partition all the tasks in the system (in other words, every task in the
+system is in exactly one of the cgroups in the hierarchy) and each
+subsystem attaches its own state to each cgroup in the hierarchy.
 
-At any one time there may be multiple active hierarchies of task
-cgroups. Each hierarchy is a partition of all tasks in the system.
+There may be zero or more active hierarchies. Each hierarchy has an
+instance of the cgroup virtual filesystem associated with it. The tree
+of cgroups is represented by the directory tree in the cgroup virtual
+filesystem.
+
+The sets of subsystems participating in distinct hierarchies are either
+identical or disjoint. If the sets are identical, the virtual filesystems
+associated with the hierarchies have identical content and a change in
+one is automatically reflected in all the others.
 
 User-level code may create and destroy cgroups by name in an
 instance of the cgroup virtual file system, specify and query to
@@ -69,9 +75,9 @@ a cgroup. Those creations and assignments only affect the hierarchy
 associated with that instance of the cgroup file system.
 
 On their own, the only use for cgroups is for simple job
-tracking. The intention is that other subsystems hook into the generic
+tracking. The intention is that subsystems hook into the generic
 cgroup support to provide new attributes for cgroups, such as
-accounting/limiting the resources which processes in a cgroup can
+accounting/limiting the resources which tasks in a cgroup can
 access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allow
 you to associate a set of CPUs and a set of memory nodes with the
 tasks in each cgroup.
@@ -79,12 +85,12 @@ tasks in each cgroup.
 1.2 Why are cgroups needed ?
 ----------------------------
 
-There are multiple efforts to provide process aggregations in the
+There are multiple efforts to provide task aggregations in the
 Linux kernel, mainly for resource-tracking purposes. Such efforts
 include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
 namespaces. These all require the basic notion of a
-grouping/partitioning of processes, with newly forked processes ending
-up in the same group (cgroup) as their parent process.
+grouping/partitioning of tasks, with newly forked tasks ending
+up in the same group (cgroup) as their parent task.
 
 The kernel cgroup patch provides the minimum essential kernel
 mechanisms required to efficiently implement such groups. It has
@@ -418,11 +424,11 @@ To remove a cgroup, just use rmdir:
 # rmdir my_sub_cs
 
 This will fail if the cgroup is in use (has cgroups inside, or
-has processes attached, or is held alive by other subsystem-specific
+has tasks attached, or is held alive by other subsystem-specific
 reference).
 
-2.2 Attaching processes
------------------------
+2.2 Attaching tasks
+-------------------
 
 # /bin/echo PID > tasks
 
@@ -450,7 +456,7 @@ move it into a new cgroup (possibly the root cgroup) by writing to the
 new cgroup's tasks file.
 
 Note: Due to some restrictions enforced by some cgroup subsystems, moving
-a process to another cgroup can fail.
+a task to another cgroup can fail.
 
 2.3 Mounting hierarchies by name
 --------------------------------
@@ -471,6 +477,13 @@ you give a subsystem a name.
 The name of the subsystem appears as part of the hierarchy description
 in /proc/mounts and /proc/<pid>/cgroups.
 
+2.4 Mounting hierarchies with no subsystems
+-------------------------------------------
+
+To mount a hierarchy with no associated subsystems, specify a name
+for the hierarchy and the dummy subsystem name "none". For example:
+
+# mount -t cgroup -o name=hier0,none hier0 /sys/fs/cgroup/h0
 
 3. Kernel API
 =============
@@ -658,7 +671,7 @@ A: bash's builtin 'echo' command does not check calls to write() against
    errors. If you use it in the cgroup file system, you won't be
    able to tell whether a command succeeded or failed.
 
-Q: When I attach processes, only the first of the line gets really attached !
+Q: When I attach tasks, only the first of the line gets really attached !
 A: We can only return one error code per call to write(). So you should also
    put only ONE PID.
 
diff --git a/patch.txt b/patch.txt
new file mode 100644
index 0000000..2b21a8e
--- /dev/null
+++ b/patch.txt
@@ -0,0 +1,109 @@
+diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
+index 821de56..003330a 100644
+--- a/Documentation/cgroups/cgroups.txt
++++ b/Documentation/cgroups/cgroups.txt
+@@ -43,24 +43,29 @@ specialized behaviour.
+ 
+ Definitions:
+ 
+-A *cgroup* associates a set of tasks with a set of parameters for one
+-or more subsystems.
++A *cgroup* associates a set of tasks with one or more subsystems.
+ 
+-A *subsystem* is a module that makes use of the task grouping
+-facilities provided by cgroups to treat groups of tasks in
+-particular ways. A subsystem is typically a "resource controller" that
++A *subsystem* is a module that treats the tasks of each cgroup in a
++particular way. A subsystem is typically a "resource controller" that
+ schedules a resource or applies per-cgroup limits, but it may be
+-anything that wants to act on a group of processes, e.g. a
+-virtualization subsystem.
++anything that wants to act on a group of tasks, e.g. a virtualization
++subsystem.
+ 
+-A *hierarchy* is a set of cgroups arranged in a tree, such that
+-every task in the system is in exactly one of the cgroups in the
+-hierarchy, and a set of subsystems; each subsystem has system-specific
+-state attached to each cgroup in the hierarchy. Each hierarchy has
+-an instance of the cgroup virtual filesystem associated with it.
++A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
++non-empty set of subsystems such that the cgroups in the hierarchy
++partition all the tasks in the system (in other words, every task in the
++system is in exactly one of the cgroups in the hierarchy) and each
++subsystem attaches its own state to each cgroup in the hierarchy.
+ 
+-At any one time there may be multiple active hierarchies of task
+-cgroups. Each hierarchy is a partition of all tasks in the system.
++There may be zero or more active hierarchies. Each hierarchy has an
++instance of the cgroup virtual filesystem associated with it. The tree
++of cgroups is represented by the directory tree in the cgroup virtual
++filesystem.
++
++The sets of subsystems participating in distinct hierarchies are either
++identical or disjoint. If the sets are identical, the virtual filesystems
++associated with the hierarchies have identical content and a change in
++one is automatically reflected in all the others.
+ 
+ User-level code may create and destroy cgroups by name in an
+ instance of the cgroup virtual file system, specify and query to
+@@ -69,9 +74,9 @@ a cgroup. Those creations and assignments only affect the hierarchy
+ associated with that instance of the cgroup file system.
+ 
+ On their own, the only use for cgroups is for simple job
+-tracking. The intention is that other subsystems hook into the generic
++tracking. The intention is that subsystems hook into the generic
+ cgroup support to provide new attributes for cgroups, such as
+-accounting/limiting the resources which processes in a cgroup can
++accounting/limiting the resources which tasks in a cgroup can
+ access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allow
+ you to associate a set of CPUs and a set of memory nodes with the
+ tasks in each cgroup.
+@@ -79,12 +84,12 @@ tasks in each cgroup.
+ 1.2 Why are cgroups needed ?
+ ----------------------------
+ 
+-There are multiple efforts to provide process aggregations in the
++There are multiple efforts to provide task aggregations in the
+ Linux kernel, mainly for resource-tracking purposes. Such efforts
+ include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
+ namespaces. These all require the basic notion of a
+-grouping/partitioning of processes, with newly forked processes ending
+-up in the same group (cgroup) as their parent process.
++grouping/partitioning of tasks, with newly forked tasks ending
++up in the same group (cgroup) as their parent task.
+ 
+ The kernel cgroup patch provides the minimum essential kernel
+ mechanisms required to efficiently implement such groups. It has
+@@ -418,11 +423,11 @@ To remove a cgroup, just use rmdir:
+ # rmdir my_sub_cs
+ 
+ This will fail if the cgroup is in use (has cgroups inside, or
+-has processes attached, or is held alive by other subsystem-specific
++has tasks attached, or is held alive by other subsystem-specific
+ reference).
+ 
+-2.2 Attaching processes
+------------------------
++2.2 Attaching tasks
++-------------------
+ 
+ # /bin/echo PID > tasks
+ 
+@@ -450,7 +455,7 @@ move it into a new cgroup (possibly the root cgroup) by writing to the
+ new cgroup's tasks file.
+ 
+ Note: Due to some restrictions enforced by some cgroup subsystems, moving
+-a process to another cgroup can fail.
++a task to another cgroup can fail.
+ 
+ 2.3 Mounting hierarchies by name
+ --------------------------------
+@@ -658,7 +663,7 @@ A: bash's builtin 'echo' command does not check calls to write() against
+    errors. If you use it in the cgroup file system, you won't be
+    able to tell whether a command succeeded or failed.
+ 
+-Q: When I attach processes, only the first of the line gets really attached !
++Q: When I attach tasks, only the first of the line gets really attached !
+ A: We can only return one error code per call to write(). So you should also
+    put only ONE PID.
+ 
diff --git a/patch2.txt b/patch2.txt
new file mode 100644
index 0000000..2b21a8e
--- /dev/null
+++ b/patch2.txt
@@ -0,0 +1,109 @@
+diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
+index 821de56..003330a 100644
+--- a/Documentation/cgroups/cgroups.txt
++++ b/Documentation/cgroups/cgroups.txt
+@@ -43,24 +43,29 @@ specialized behaviour.
+ 
+ Definitions:
+ 
+-A *cgroup* associates a set of tasks with a set of parameters for one
+-or more subsystems.
++A *cgroup* associates a set of tasks with one or more subsystems.
+ 
+-A *subsystem* is a module that makes use of the task grouping
+-facilities provided by cgroups to treat groups of tasks in
+-particular ways. A subsystem is typically a "resource controller" that
++A *subsystem* is a module that treats the tasks of each cgroup in a
++particular way. A subsystem is typically a "resource controller" that
+ schedules a resource or applies per-cgroup limits, but it may be
+-anything that wants to act on a group of processes, e.g. a
+-virtualization subsystem.
++anything that wants to act on a group of tasks, e.g. a virtualization
++subsystem.
+ 
+-A *hierarchy* is a set of cgroups arranged in a tree, such that
+-every task in the system is in exactly one of the cgroups in the
+-hierarchy, and a set of subsystems; each subsystem has system-specific
+-state attached to each cgroup in the hierarchy. Each hierarchy has
+-an instance of the cgroup virtual filesystem associated with it.
++A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
++non-empty set of subsystems such that the cgroups in the hierarchy
++partition all the tasks in the system (in other words, every task in the
++system is in exactly one of the cgroups in the hierarchy) and each
++subsystem attaches its own state to each cgroup in the hierarchy.
+ 
+-At any one time there may be multiple active hierarchies of task
+-cgroups. Each hierarchy is a partition of all tasks in the system.
++There may be zero or more active hierarchies. Each hierarchy has an
++instance of the cgroup virtual filesystem associated with it. The tree
++of cgroups is represented by the directory tree in the cgroup virtual
++filesystem.
++
++The sets of subsystems participating in distinct hierarchies are either
++identical or disjoint. If the sets are identical, the virtual filesystems
++associated with the hierarchies have identical content and a change in
++one is automatically reflected in all the others.
+ 
+ User-level code may create and destroy cgroups by name in an
+ instance of the cgroup virtual file system, specify and query to
+@@ -69,9 +74,9 @@ a cgroup. Those creations and assignments only affect the hierarchy
+ associated with that instance of the cgroup file system.
+ 
+ On their own, the only use for cgroups is for simple job
+-tracking. The intention is that other subsystems hook into the generic
++tracking. The intention is that subsystems hook into the generic
+ cgroup support to provide new attributes for cgroups, such as
+-accounting/limiting the resources which processes in a cgroup can
++accounting/limiting the resources which tasks in a cgroup can
+ access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allow
+ you to associate a set of CPUs and a set of memory nodes with the
+ tasks in each cgroup.
+@@ -79,12 +84,12 @@ tasks in each cgroup.
+ 1.2 Why are cgroups needed ?
+ ----------------------------
+ 
+-There are multiple efforts to provide process aggregations in the
++There are multiple efforts to provide task aggregations in the
+ Linux kernel, mainly for resource-tracking purposes. Such efforts
+ include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
+ namespaces. These all require the basic notion of a
+-grouping/partitioning of processes, with newly forked processes ending
+-up in the same group (cgroup) as their parent process.
++grouping/partitioning of tasks, with newly forked tasks ending
++up in the same group (cgroup) as their parent task.
+ 
+ The kernel cgroup patch provides the minimum essential kernel
+ mechanisms required to efficiently implement such groups. It has
+@@ -418,11 +423,11 @@ To remove a cgroup, just use rmdir:
+ # rmdir my_sub_cs
+ 
+ This will fail if the cgroup is in use (has cgroups inside, or
+-has processes attached, or is held alive by other subsystem-specific
++has tasks attached, or is held alive by other subsystem-specific
+ reference).
+ 
+-2.2 Attaching processes
+------------------------
++2.2 Attaching tasks
++-------------------
+ 
+ # /bin/echo PID > tasks
+ 
+@@ -450,7 +455,7 @@ move it into a new cgroup (possibly the root cgroup) by writing to the
+ new cgroup's tasks file.
+ 
+ Note: Due to some restrictions enforced by some cgroup subsystems, moving
+-a process to another cgroup can fail.
++a task to another cgroup can fail.
+ 
+ 2.3 Mounting hierarchies by name
+ --------------------------------
+@@ -658,7 +663,7 @@ A: bash's builtin 'echo' command does not check calls to write() against
+    errors. If you use it in the cgroup file system, you won't be
+    able to tell whether a command succeeded or failed.
+ 
+-Q: When I attach processes, only the first of the line gets really attached !
++Q: When I attach tasks, only the first of the line gets really attached !
+ A: We can only return one error code per call to write(). So you should also
+    put only ONE PID.
+ 

* Re: [PATCH v3] control groups: documentation improvements
  2014-04-02 12:43 ` [PATCH v2] " Glyn Normington
@ 2014-04-02 13:17 ` Glyn Normington
  2014-04-16 21:00   ` Tejun Heo
  0 siblings, 1 reply; 18+ messages in thread
From: Glyn Normington @ 2014-04-02 13:17 UTC
To: Tejun Heo; +Cc: linux-kernel

From: Glyn Normington<gnormington@gopivotal.com>

Various clarifications to make the control groups documentation
easier to understand, especially for newcomers.

Delete the phrase "set of parameters" which obfuscates the definition
of cgroup.

Crisp up the definition of subsystem.

Explain the term "partition".

Describe the representation of the cgroup virtual filesystem, since
this is not specifically described later in the document.

Clarify that subsystems may be attached to multiple hierarchies,
although this isn't very useful, and explain what happens.

Document how to create a hierarchy with no associated subsystems.

Use the term "task" in preference to "process" everywhere, for
consistency.

Related LKML thread:
http://lkml.iu.edu//hypermail/linux/kernel/1402.0/02419.html

Signed-off-by: Glyn Normington<gnormington@gopivotal.com>
---
Kernel version: Linux 3.14-rc5.

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 821de56..f086b70 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -24,6 +24,7 @@ CONTENTS:
   2.1 Basic Usage
   2.2 Attaching processes
   2.3 Mounting hierarchies by name
+  2.4 Mounting hierarchies with no subsystems
 3. Kernel API
   3.1 Overview
   3.2 Synchronization
@@ -43,24 +44,29 @@ specialized behaviour.
 
 Definitions:
 
-A *cgroup* associates a set of tasks with a set of parameters for one
-or more subsystems.
+A *cgroup* associates a set of tasks with zero or more subsystems.
 
-A *subsystem* is a module that makes use of the task grouping
-facilities provided by cgroups to treat groups of tasks in
-particular ways. A subsystem is typically a "resource controller" that
+A *subsystem* is a module that treats the tasks of each cgroup in a
+particular way. A subsystem is typically a "resource controller" that
 schedules a resource or applies per-cgroup limits, but it may be
-anything that wants to act on a group of processes, e.g. a
-virtualization subsystem.
+anything that wants to act on a group of tasks, e.g. a virtualization
+subsystem.
 
-A *hierarchy* is a set of cgroups arranged in a tree, such that
-every task in the system is in exactly one of the cgroups in the
-hierarchy, and a set of subsystems; each subsystem has system-specific
-state attached to each cgroup in the hierarchy. Each hierarchy has
-an instance of the cgroup virtual filesystem associated with it.
+A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
+set of subsystems such that the cgroups in the hierarchy
+partition all the tasks in the system (in other words, every task in the
+system is in exactly one of the cgroups in the hierarchy) and each
+subsystem attaches its own state to each cgroup in the hierarchy.
 
-At any one time there may be multiple active hierarchies of task
-cgroups. Each hierarchy is a partition of all tasks in the system.
+There may be zero or more active hierarchies. Each hierarchy has an
+instance of the cgroup virtual filesystem associated with it. The tree
+of cgroups is represented by the directory tree in the cgroup virtual
+filesystem.
+
+The sets of subsystems participating in distinct hierarchies are either
+identical or disjoint. If the sets are identical, the virtual filesystems
+associated with the hierarchies have identical content and a change in
+one is automatically reflected in all the others.
 
 User-level code may create and destroy cgroups by name in an
 instance of the cgroup virtual file system, specify and query to
@@ -69,9 +75,9 @@ a cgroup. Those creations and assignments only affect the hierarchy
 associated with that instance of the cgroup file system.
 
 On their own, the only use for cgroups is for simple job
-tracking. The intention is that other subsystems hook into the generic
+tracking. The intention is that subsystems hook into the generic
 cgroup support to provide new attributes for cgroups, such as
-accounting/limiting the resources which processes in a cgroup can
+accounting/limiting the resources which tasks in a cgroup can
 access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allow
 you to associate a set of CPUs and a set of memory nodes with the
 tasks in each cgroup.
@@ -79,12 +85,12 @@ tasks in each cgroup.
 1.2 Why are cgroups needed ?
 ----------------------------
 
-There are multiple efforts to provide process aggregations in the
+There are multiple efforts to provide task aggregations in the
 Linux kernel, mainly for resource-tracking purposes. Such efforts
 include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
 namespaces. These all require the basic notion of a
-grouping/partitioning of processes, with newly forked processes ending
-up in the same group (cgroup) as their parent process.
+grouping/partitioning of tasks, with newly forked tasks ending
+up in the same group (cgroup) as their parent task.
 
 The kernel cgroup patch provides the minimum essential kernel
 mechanisms required to efficiently implement such groups. It has
@@ -418,11 +424,11 @@ To remove a cgroup, just use rmdir:
 # rmdir my_sub_cs
 
 This will fail if the cgroup is in use (has cgroups inside, or
-has processes attached, or is held alive by other subsystem-specific
+has tasks attached, or is held alive by other subsystem-specific
 reference).
 
-2.2 Attaching processes
------------------------
+2.2 Attaching tasks
+-------------------
 
 # /bin/echo PID > tasks
 
@@ -450,7 +456,7 @@ move it into a new cgroup (possibly the root cgroup) by writing to the
 new cgroup's tasks file.
 
 Note: Due to some restrictions enforced by some cgroup subsystems, moving
-a process to another cgroup can fail.
+a task to another cgroup can fail.
 
 2.3 Mounting hierarchies by name
 --------------------------------
@@ -471,6 +477,13 @@ you give a subsystem a name.
 The name of the subsystem appears as part of the hierarchy description
 in /proc/mounts and /proc/<pid>/cgroups.
 
+2.4 Mounting hierarchies with no subsystems
+-------------------------------------------
+
+To mount a hierarchy with no associated subsystems, specify a name
+for the hierarchy and the dummy subsystem name "none". For example:
+
+# mount -t cgroup -o name=hier0,none hier0 /sys/fs/cgroup/h0
 
 3. Kernel API
 =============
@@ -658,7 +671,7 @@ A: bash's builtin 'echo' command does not check calls to write() against
    errors. If you use it in the cgroup file system, you won't be
    able to tell whether a command succeeded or failed.
 
-Q: When I attach processes, only the first of the line gets really attached !
+Q: When I attach tasks, only the first of the line gets really attached !
 A: We can only return one error code per call to write(). So you should also
    put only ONE PID.
 

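Section 2.4 of the patch above mounts a hierarchy that has a name but no subsystems, and the surrounding text says the name shows up in the per-task hierarchy description under /proc. As a minimal sketch of what that looks like to a reader of that file, the Python below parses lines in the cgroup v1 form ID:CONTROLLERS:PATH and separates a "name=" field from real subsystem names. The names parse_proc_cgroup and CgroupEntry and the sample text are hypothetical and for illustration only; they are not part of the patch, and the sample was not captured from a real system.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    name: Optional[str]          # value of a "name=" field, if present
    subsystems: FrozenSet[str]   # attached subsystems, possibly empty
    path: str                    # cgroup path within the hierarchy


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse lines of the assumed form ID:CONTROLLERS:PATH."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split only twice: the cgroup path itself may contain ':'.
        hierarchy_id, controllers, path = line.strip().split(":", 2)
        name = None
        subsystems = set()
        for field in controllers.split(","):
            if field.startswith("name="):
                name = field[len("name="):]
            elif field:
                subsystems.add(field)
        entries.append(
            CgroupEntry(int(hierarchy_id), name, frozenset(subsystems), path)
        )
    return entries


# Hypothetical sample: two ordinary hierarchies plus one named hierarchy
# with no subsystems, as mounted with "-o name=hier0,none".
SAMPLE = """\
3:cpu,cpuacct:/batch
2:memory:/batch/job1
1:name=hier0:/
"""

for entry in parse_proc_cgroup(SAMPLE):
    print(entry)
```

The named hierarchy is the one whose subsystem set comes back empty, which matches the "zero or more subsystems" wording in the revised definition.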
* Re: [PATCH v3] control groups: documentation improvements
  2014-04-02 13:17 ` [PATCH v3] " Glyn Normington
@ 2014-04-16 21:00 ` Tejun Heo
  2014-04-17 10:46   ` [PATCH v4] " Glyn Normington
  0 siblings, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-04-16 21:00 UTC
To: Glyn Normington; +Cc: linux-kernel

On Wed, Apr 02, 2014 at 02:17:13PM +0100, Glyn Normington wrote:
> From: Glyn Normington<gnormington@gopivotal.com>
>
> Various clarifications to make the control groups documentation
> easier to understand, especially for newcomers.

Patch doesn't apply. Can you please re-generate the patch with 'diff
-u' and send it as an attachment?

Thanks.

-- 
tejun

* [PATCH v4] control groups: documentation improvements
  2014-04-16 21:00 ` Tejun Heo
@ 2014-04-17 10:46 ` Glyn Normington
  2014-04-17 13:16   ` Tejun Heo
  0 siblings, 1 reply; 18+ messages in thread
From: Glyn Normington @ 2014-04-17 10:46 UTC
To: Tejun Heo; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 744 bytes --]

On 16/04/2014 22:00, Tejun Heo wrote:
> On Wed, Apr 02, 2014 at 02:17:13PM +0100, Glyn Normington wrote:
>> From: Glyn Normington<gnormington@gopivotal.com>
>>
>> Various clarifications to make the control groups documentation
>> easier to understand, especially for newcomers.
> Patch doesn't apply. Can you please re-generate the patch with 'diff
> -u' and send it as an attachment?
>
> Thanks.

From: Glyn Normington<gnormington@gopivotal.com>

Please see the attachment for the patch created using 'diff -u'. I
checked it applied ok with:

patch -p1 < ~/patchfile

at the latest HEAD (6ca2a88).

Related LKML thread:
http://lkml.iu.edu//hypermail/linux/kernel/1402.0/02419.html

Signed-off-by: Glyn Normington<gnormington@gopivotal.com>

[-- Attachment #2: patchfile --]
[-- Type: text/plain, Size: 5863 bytes --]

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 821de56..f086b70 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -24,6 +24,7 @@ CONTENTS:
   2.1 Basic Usage
   2.2 Attaching processes
   2.3 Mounting hierarchies by name
+  2.4 Mounting hierarchies with no subsystems
 3. Kernel API
   3.1 Overview
   3.2 Synchronization
@@ -43,24 +44,29 @@ specialized behaviour.
 
 Definitions:
 
-A *cgroup* associates a set of tasks with a set of parameters for one
-or more subsystems.
+A *cgroup* associates a set of tasks with zero or more subsystems.
 
-A *subsystem* is a module that makes use of the task grouping
-facilities provided by cgroups to treat groups of tasks in
-particular ways. A subsystem is typically a "resource controller" that
+A *subsystem* is a module that treats the tasks of each cgroup in a
+particular way. A subsystem is typically a "resource controller" that
 schedules a resource or applies per-cgroup limits, but it may be
-anything that wants to act on a group of processes, e.g. a
-virtualization subsystem.
+anything that wants to act on a group of tasks, e.g. a virtualization
+subsystem.
 
-A *hierarchy* is a set of cgroups arranged in a tree, such that
-every task in the system is in exactly one of the cgroups in the
-hierarchy, and a set of subsystems; each subsystem has system-specific
-state attached to each cgroup in the hierarchy. Each hierarchy has
-an instance of the cgroup virtual filesystem associated with it.
+A *hierarchy* is a non-empty set of cgroups arranged in a tree and a
+set of subsystems such that the cgroups in the hierarchy
+partition all the tasks in the system (in other words, every task in the
+system is in exactly one of the cgroups in the hierarchy) and each
+subsystem attaches its own state to each cgroup in the hierarchy.
 
-At any one time there may be multiple active hierarchies of task
-cgroups. Each hierarchy is a partition of all tasks in the system.
+There may be zero or more active hierarchies. Each hierarchy has an
+instance of the cgroup virtual filesystem associated with it. The tree
+of cgroups is represented by the directory tree in the cgroup virtual
+filesystem.
+
+The sets of subsystems participating in distinct hierarchies are either
+identical or disjoint. If the sets are identical, the virtual filesystems
+associated with the hierarchies have identical content and a change in
+one is automatically reflected in all the others.
 
 User-level code may create and destroy cgroups by name in an
 instance of the cgroup virtual file system, specify and query to
@@ -69,9 +75,9 @@ a cgroup. Those creations and assignments only affect the hierarchy
 associated with that instance of the cgroup file system.
 
 On their own, the only use for cgroups is for simple job
-tracking. The intention is that other subsystems hook into the generic
+tracking. The intention is that subsystems hook into the generic
 cgroup support to provide new attributes for cgroups, such as
-accounting/limiting the resources which processes in a cgroup can
+accounting/limiting the resources which tasks in a cgroup can
 access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allow
 you to associate a set of CPUs and a set of memory nodes with the
 tasks in each cgroup.
@@ -79,12 +85,12 @@ tasks in each cgroup.
 1.2 Why are cgroups needed ?
 ----------------------------
 
-There are multiple efforts to provide process aggregations in the
+There are multiple efforts to provide task aggregations in the
 Linux kernel, mainly for resource-tracking purposes. Such efforts
 include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
 namespaces. These all require the basic notion of a
-grouping/partitioning of processes, with newly forked processes ending
-up in the same group (cgroup) as their parent process.
+grouping/partitioning of tasks, with newly forked tasks ending
+up in the same group (cgroup) as their parent task.
 
 The kernel cgroup patch provides the minimum essential kernel
 mechanisms required to efficiently implement such groups. It has
@@ -418,11 +424,11 @@ To remove a cgroup, just use rmdir:
 # rmdir my_sub_cs
 
 This will fail if the cgroup is in use (has cgroups inside, or
-has processes attached, or is held alive by other subsystem-specific
+has tasks attached, or is held alive by other subsystem-specific
 reference).
 
-2.2 Attaching processes
------------------------
+2.2 Attaching tasks
+-------------------
 
 # /bin/echo PID > tasks
 
@@ -450,7 +456,7 @@ move it into a new cgroup (possibly the root cgroup) by writing to the
 new cgroup's tasks file.
 
 Note: Due to some restrictions enforced by some cgroup subsystems, moving
-a process to another cgroup can fail.
+a task to another cgroup can fail.
 
 2.3 Mounting hierarchies by name
 --------------------------------
@@ -471,6 +477,13 @@ you give a subsystem a name.
 The name of the subsystem appears as part of the hierarchy description
 in /proc/mounts and /proc/<pid>/cgroups.
 
+2.4 Mounting hierarchies with no subsystems
+-------------------------------------------
+
+To mount a hierarchy with no associated subsystems, specify a name
+for the hierarchy and the dummy subsystem name "none". For example:
+
+# mount -t cgroup -o name=hier0,none hier0 /sys/fs/cgroup/h0
 
 3. Kernel API
 =============
@@ -658,7 +671,7 @@ A: bash's builtin 'echo' command does not check calls to write() against
    errors. If you use it in the cgroup file system, you won't be
    able to tell whether a command succeeded or failed.
 
-Q: When I attach processes, only the first of the line gets really attached !
+Q: When I attach tasks, only the first of the line gets really attached !
 A: We can only return one error code per call to write(). So you should also
    put only ONE PID.
 

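The revised definitions in the patch above state two properties that can be checked mechanically: the cgroups of a hierarchy partition all tasks, and the subsystem sets of any two hierarchies are either identical or disjoint. The Python below is a small model of just those two statements, not of the kernel's data structures; the function names, the dict-of-sets representation, and the sample values are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def is_partition(cgroups: Dict[str, Set[int]], all_tasks: Set[int]) -> bool:
    """True if every task is in exactly one cgroup of the hierarchy.

    Unlike a strict mathematical partition, a cgroup is allowed to be
    empty here, which follows the parenthetical in the patch text.
    """
    seen: Set[int] = set()
    for tasks in cgroups.values():
        if seen & tasks:          # some task appears in two cgroups
            return False
        seen |= tasks
    return seen == all_tasks      # no task left out, no unknown task


def identical_or_disjoint(subsystem_sets: Iterable[FrozenSet[str]]) -> bool:
    """True if every pair of subsystem sets is identical or disjoint."""
    return all(
        a == b or a.isdisjoint(b)
        for a, b in combinations(list(subsystem_sets), 2)
    )


# Hypothetical hierarchy: cgroup path -> task IDs attached to it.
hierarchy = {"/": {1}, "/batch": {2, 3}, "/batch/job1": {4}}
print(is_partition(hierarchy, {1, 2, 3, 4}))

# Hypothetical subsystem sets of three hierarchies; the empty set
# stands for a hierarchy mounted with "none".
print(identical_or_disjoint(
    [frozenset({"cpu", "cpuacct"}), frozenset({"memory"}), frozenset()]
))
```

In this model a hierarchy with no subsystems never conflicts with any other, since the empty set is disjoint from every set.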
* Re: [PATCH v4] control groups: documentation improvements
  2014-04-17 10:46 ` [PATCH v4] " Glyn Normington
@ 2014-04-17 13:16 ` Tejun Heo
  2014-04-17 13:45   ` Glyn Normington
  0 siblings, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-04-17 13:16 UTC
To: Glyn Normington; +Cc: linux-kernel

Hello, Glyn.

On Thu, Apr 17, 2014 at 11:46:13AM +0100, Glyn Normington wrote:
> +There may be zero or more active hierarchies. Each hierarchy has an
> +instance of the cgroup virtual filesystem associated with it. The tree
> +of cgroups is represented by the directory tree in the cgroup virtual
> +filesystem.
> +
> +The sets of subsystems participating in distinct hierarchies are either
> +identical or disjoint. If the sets are identical, the virtual filesystems
> +associated with the hierarchies have identical content and a change in
> +one is automatically reflected in all the others.

I can't say I'm a big fan of these definitions in mathematical terms.
They're so precise and useless at the same time. That said, I don't
really understand the last paragraph. Is it trying to talk about
multiple mounts of a single hierarchy?

Thanks.

-- 
tejun

* Re: [PATCH v4] control groups: documentation improvements
  2014-04-17 13:16 ` Tejun Heo
@ 2014-04-17 13:45 ` Glyn Normington
  2014-04-17 13:55   ` Tejun Heo
  0 siblings, 1 reply; 18+ messages in thread
From: Glyn Normington @ 2014-04-17 13:45 UTC
To: Tejun Heo; +Cc: linux-kernel

Hi Tejun

On 17/04/2014 14:16, Tejun Heo wrote:
> Hello, Glyn.
>
> On Thu, Apr 17, 2014 at 11:46:13AM +0100, Glyn Normington wrote:
>> +There may be zero or more active hierarchies. Each hierarchy has an
>> +instance of the cgroup virtual filesystem associated with it. The tree
>> +of cgroups is represented by the directory tree in the cgroup virtual
>> +filesystem.
>> +
>> +The sets of subsystems participating in distinct hierarchies are either
>> +identical or disjoint. If the sets are identical, the virtual filesystems
>> +associated with the hierarchies have identical content and a change in
>> +one is automatically reflected in all the others.
> I can't say I'm a big fan of these definitions in mathematical terms.
> They're so precise and useless at the same time.

We would like to be both precise and readable. Please point out the
"useless" bits and we'll try to make them better.

> That said, I don't
> really understand the last paragraph. Is it trying to talk about
> multiple mounts of a single hierarchy?

Yes. When this came up earlier, Li Zefan thought we could delete that
paragraph "because we all know the same filesystem can have more than
one mount point and cgroupfs is no different". But since the
underlying cgroupfs is only visible through the representation(s) at
its mount points, we'd prefer to keep the paragraph.

How about changing the paragraph to say:

A given hierarchy may be associated with more than one virtual
filesystem, in which case each of the virtual filesystems has
identical contents to the others.

?

>
> Thanks.

Regards,
Glyn

* Re: [PATCH v4] control groups: documentation improvements 2014-04-17 13:45 ` Glyn Normington @ 2014-04-17 13:55 ` Tejun Heo 2014-04-17 14:51 ` Glyn Normington 0 siblings, 1 reply; 18+ messages in thread From: Tejun Heo @ 2014-04-17 13:55 UTC (permalink / raw) To: Glyn Normington; +Cc: linux-kernel Hello, On Thu, Apr 17, 2014 at 02:45:40PM +0100, Glyn Normington wrote: > >>+The sets of subsystems participating in distinct hierarchies are either > >>+identical or disjoint. If the sets are identical, the virtual filesystems > >>+associated with the hierarchies have identical content and a change in > >>+one is automatically reflected in all the others. > > > >I can't say I'm a big fan of these definitions in mathematical terms. > >They're so precise and useless at the same time. > > We would like to be both precise and readable. Please point out the > "useless" bits and we'll try to make them better. I think it becomes useless when mathematical precision is pursued beyond the necessary point, forcing people to parse and analyze the description to reach a concept she already has full understanding of. Just using those pre-established concepts is far more efficient use of brain power than trying to craft the precise mathematical definition from vacuum and, [un]surprisingly, leads to lower rate of miscommunication. It's kinda useless to go through all the precise terms to re-define hierarchical grouping of tasks, which is both accurate and intuitive enough. Adding extra descriptions to clarify ambiguities and just to reinforce the concept would be fine but trying to build the concept from the ground is silly at best. Starting with something intuitive and refining it is a far better approach. > A given hierarchy may be associated with more than one virtual > filesystem, in which case each of the virtual filesystems has > identical contents to the others. The above is inaccurate because there really is just one filesystem (represented by a single super block). 
There are multiple mount points of the same file system, but still just a single file system. i.e. mounting /dev/sdb2 in multiple places doesn't really create multiple file systems. Thanks. -- tejun ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4] control groups: documentation improvements 2014-04-17 13:55 ` Tejun Heo @ 2014-04-17 14:51 ` Glyn Normington 2014-04-17 14:57 ` Tejun Heo 0 siblings, 1 reply; 18+ messages in thread From: Glyn Normington @ 2014-04-17 14:51 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-kernel Hi Tejun On 17/04/2014 14:55, Tejun Heo wrote: > Hello, > > On Thu, Apr 17, 2014 at 02:45:40PM +0100, Glyn Normington wrote: >>>> +The sets of subsystems participating in distinct hierarchies are either >>>> +identical or disjoint. If the sets are identical, the virtual filesystems >>>> +associated with the hierarchies have identical content and a change in >>>> +one is automatically reflected in all the others. >>> I can't say I'm a big fan of these definitions in mathematical terms. >>> They're so precise and useless at the same time. >> We would like to be both precise and readable. Please point out the >> "useless" bits and we'll try to make them better. > I think it becomes useless when mathematical precision is pursued > beyond the necessary point, forcing people to parse and analyze the > description to reach a concept she already has full understanding of. > Just using those pre-established concepts is far more efficient use of > brain power than trying to craft the precise mathematical definition > from vacuum and, [un]surprisingly, leads to lower rate of > miscommunication. > > It's kinda useless to go through all the precise terms to re-define > hierarchical grouping of tasks, which is both accurate and intuitive > enough. Adding extra descriptions to clarify ambiguities and just to > reinforce the concept would be fine but trying to build the concept > from the ground is silly at best. Starting with something intuitive > and refining it is a far better approach. I'm sorry you feel this way. 
A couple of us (full disclosure: both mathematicians) tried hard to get a precise understanding of cgroups from cgroups.txt, but several terms remained vague until we had done some experiments and discussed our findings on the mailing list. The aim of the patch is to crisp up the definitions of those terms for other newcomers, so they won't have to go through the same exercise. Interestingly, after we had understood the terms, cgroups.txt seemed much clearer than it did originally. But that's because we were tending to read our new-found understanding into the text. Might you not be doing the same? So, how would you like to proceed? You could reject the patch outright if you think our experience is unrepresentative. Or, for the benefit of other newcomers, we are willing to try reworking the parts you find unreadable if you could kindly pick them out. The choice is yours. :-) > >> A given hierarchy may be associated with more than one virtual >> filesystem, in which case each of the virtual filesystems has >> identical contents to the others. > The above is inaccurate because there really is just one filesystem > (represented by a single super block). There are multiple mount > points of the same file system, but still just a single file system. > i.e. mounting /dev/sdb2 in multiple places doesn't really create > multiple file systems. Thanks for the clarification. If you agree to proceed, we should be able to find a simpler way to cover this paragraph. > > Thanks. > Regards, Glyn ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4] control groups: documentation improvements 2014-04-17 14:51 ` Glyn Normington @ 2014-04-17 14:57 ` Tejun Heo 0 siblings, 0 replies; 18+ messages in thread From: Tejun Heo @ 2014-04-17 14:57 UTC (permalink / raw) To: Glyn Normington; +Cc: linux-kernel Hello, Glyn. On Thu, Apr 17, 2014 at 03:51:32PM +0100, Glyn Normington wrote: > >It's kinda useless to go through all the precise terms to re-define > >hierarchical grouping of tasks, which is both accurate and intuitive > >enough. Adding extra descriptions to clarify ambiguities and just to > >reinforce the concept would be fine but trying to build the concept > >from the ground is silly at best. Starting with something intuitive > >and refining it is a far better approach. > > I'm sorry you feel this way. A couple of us (full disclosure: both > mathematicians) tried hard to get a precise understanding of cgroups > from cgroups.txt, but several terms remained vague until we had done > some experiments and discussed our findings on the mailing list. > > The aim of the patch is to crisp up the definitions of those terms > for other newcomers, so they won't have to go through the same > exercise. Oh, don't get me wrong. The current documentation is neither intuitive nor precise. I have a hard time understanding what it's saying, so probably even just increasing precision is an improvement. > Interestingly, after we had understood the terms, cgroups.txt seemed > much clearer than it did originally. But that's because we were > tending to read our new-found understanding into the text. Might you > not be doing the same? Sure thing. Please go ahead and improve it. It's not good at all on all fronts. > So, how would you like to proceed? You could reject the patch > outright if you think our experience is unrepresentative. Or, for > the benefit of other newcomers, we are willing to try reworking the > parts you find unreadable if you could kindly pick them out. The > choice is yours. 
:-) Again, I think it's an improvement but was just hoping you could add a few more intuitive explanations so that it's more approachable. The two properties aren't mutually exclusive. > Thanks for the clarification. If you agree to proceed, we should be > able to find a simpler way to cover this paragraph. I really don't mind being verbose if it makes things clearer and easier to understand. Thanks! -- tejun ^ permalink raw reply [flat|nested] 18+ messages in thread