* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <CAAAKZwu67VMiZgdpp=i5p7zyGbOHGHXwF_iprufGPzTLkkUF2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-10-28 23:30 ` Andrew Morton
  [not found] ` <20111028163021.1ce61f8a.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2011-10-28 23:30 UTC
To: Tim Hockin
Cc: Aditya Kali, Frederic Weisbecker, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Containers

On Tue, 25 Oct 2011 13:06:35 -0700
Tim Hockin <thockin-Rl2oBbRerpQdnm+yROfE0A@public.gmane.org> wrote:

> On Tue, Oct 4, 2011 at 3:01 PM, Andrew Morton <akpm00-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > On Mon, 3 Oct 2011 21:07:02 +0200
> > Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >> Hi Andrew,
> >>
> >> This contains minor changes, mostly documentation and changelog
> >> updates, off-case build fix, and a code optimization in
> >> res_counter_common_ancestor().
> >
> > I'd normally duck a patch series like this when we're at -rc8 and ask
> > for it to be resent late in -rc1. But I was feeling frisky so I
> > grabbed this lot for a bit of testing and will sit on it until -rc1.
> >
> > I'm still not convinced that the kernel has a burning need for a "task
> > counter subsystem". Someone convince me that we should merge this!
>
> We have real (accidental) DoS situations which happen because we don't
> have this. It usually takes the form of some library not re-joining
> threads. We end up deploying a few apps linked against this library,
> and suddenly we're in trouble on a machine. Except, this being
> Google, we're in trouble on a lot of machines.

This is a bit foggy. I think you mean that machines are experiencing
accidental forkbombs?

> There may be other ways to cobble this sort of safety together, but
> they are less appealing for various reasons. cgroups are how we
> control groups of related pids.
>
> I'd really love to be able to use this.

Has it been confirmed that this implementation actually solves the
problem? ie: tested a bit?

btw, Frederic told me that this version of the patchset had some
serious problem so it's on hold pending an upgrade, regardless of other
matters.
[parent not found: <20111028163021.1ce61f8a.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <20111028163021.1ce61f8a.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
@ 2011-10-29 9:38 ` Glauber Costa
  [not found] ` <CAA6-i6o0SPfZJDx4SRR1hY-He0L6zHuv0saH6EaE7Mrc2HF6PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2011-11-03 17:00 ` Frederic Weisbecker
  1 sibling, 1 reply; 17+ messages in thread
From: Glauber Costa @ 2011-10-29 9:38 UTC
To: Andrew Morton
Cc: Aditya Kali, Tim Hockin, Frederic Weisbecker, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Containers

On Sat, Oct 29, 2011 at 1:30 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 25 Oct 2011 13:06:35 -0700
> Tim Hockin <thockin@hockin.org> wrote:
>
>> On Tue, Oct 4, 2011 at 3:01 PM, Andrew Morton <akpm00@gmail.com> wrote:
>> > On Mon, 3 Oct 2011 21:07:02 +0200
>> > Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> >
>> >> Hi Andrew,
>> >>
>> >> This contains minor changes, mostly documentation and changelog
>> >> updates, off-case build fix, and a code optimization in
>> >> res_counter_common_ancestor().
>> >
>> > I'd normally duck a patch series like this when we're at -rc8 and ask
>> > for it to be resent late in -rc1. But I was feeling frisky so I
>> > grabbed this lot for a bit of testing and will sit on it until -rc1.
>> >
>> > I'm still not convinced that the kernel has a burning need for a "task
>> > counter subsystem". Someone convince me that we should merge this!
>>
>> We have real (accidental) DoS situations which happen because we don't
>> have this. It usually takes the form of some library not re-joining
>> threads. We end up deploying a few apps linked against this library,
>> and suddenly we're in trouble on a machine. Except, this being
>> Google, we're in trouble on a lot of machines.
>
> This is a bit foggy. I think you mean that machines are experiencing
> accidental forkbombs?
>
>> There may be other ways to cobble this sort of safety together, but
>> they are less appealing for various reasons. cgroups are how we
>> control groups of related pids.
>>

At the end of the day, all cgroups are just a group of tasks. So I don't really
get the need to have a cgroup to control the number of tasks in the system.

Why don't we just allow every cgroup to have a limit on the number of
tasks it can hold?

--
Sent from my Atari.
[parent not found: <CAA6-i6o0SPfZJDx4SRR1hY-He0L6zHuv0saH6EaE7Mrc2HF6PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <CAA6-i6o0SPfZJDx4SRR1hY-He0L6zHuv0saH6EaE7Mrc2HF6PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-03 16:49 ` Frederic Weisbecker
  [not found] ` <20111103164917.GF8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Frederic Weisbecker @ 2011-11-03 16:49 UTC
To: Glauber Costa
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Andrew Morton, Containers

On Sat, Oct 29, 2011 at 11:38:25AM +0200, Glauber Costa wrote:
> On Sat, Oct 29, 2011 at 1:30 AM, Andrew Morton
> <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> > On Tue, 25 Oct 2011 13:06:35 -0700
> > Tim Hockin <thockin-Rl2oBbRerpQdnm+yROfE0A@public.gmane.org> wrote:
> >
> >> On Tue, Oct 4, 2011 at 3:01 PM, Andrew Morton <akpm00-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >> > On Mon, 3 Oct 2011 21:07:02 +0200
> >> > Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >> >
> >> >> Hi Andrew,
> >> >>
> >> >> This contains minor changes, mostly documentation and changelog
> >> >> updates, off-case build fix, and a code optimization in
> >> >> res_counter_common_ancestor().
> >> >
> >> > I'd normally duck a patch series like this when we're at -rc8 and ask
> >> > for it to be resent late in -rc1. But I was feeling frisky so I
> >> > grabbed this lot for a bit of testing and will sit on it until -rc1.
> >> >
> >> > I'm still not convinced that the kernel has a burning need for a "task
> >> > counter subsystem". Someone convince me that we should merge this!
> >>
> >> We have real (accidental) DoS situations which happen because we don't
> >> have this. It usually takes the form of some library not re-joining
> >> threads. We end up deploying a few apps linked against this library,
> >> and suddenly we're in trouble on a machine. Except, this being
> >> Google, we're in trouble on a lot of machines.
> >
> > This is a bit foggy. I think you mean that machines are experiencing
> > accidental forkbombs?
> >
> >> There may be other ways to cobble this sort of safety together, but
> >> they are less appealing for various reasons. cgroups are how we
> >> control groups of related pids.
> >>
>
> At the end of the day, all cgroups are just a group of tasks. So I don't really
> get the need to have a cgroup to control the number of tasks in the system.
>
> Why don't we just allow every cgroup to have a limit on the number of
> tasks it can hold?

Not sure what you mean. You would prefer to have this as a core feature in
cgroups rather than a subsystem?
[parent not found: <20111103164917.GF8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <20111103164917.GF8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
@ 2011-11-03 16:58 ` Glauber Costa
  [not found] ` <4EB2C852.6020706-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Glauber Costa @ 2011-11-03 16:58 UTC
To: Frederic Weisbecker
Cc: Aditya Kali, Tim Hockin, Glauber Costa, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Andrew Morton, Paul Turner, Containers

On 11/03/2011 02:49 PM, Frederic Weisbecker wrote:
> On Sat, Oct 29, 2011 at 11:38:25AM +0200, Glauber Costa wrote:
>> On Sat, Oct 29, 2011 at 1:30 AM, Andrew Morton
>> <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
>>> On Tue, 25 Oct 2011 13:06:35 -0700
>>> Tim Hockin<thockin-Rl2oBbRerpQdnm+yROfE0A@public.gmane.org> wrote:
>>>
>>>> On Tue, Oct 4, 2011 at 3:01 PM, Andrew Morton<akpm00-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>> On Mon, 3 Oct 2011 21:07:02 +0200
>>>>> Frederic Weisbecker<fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>>
>>>>>> Hi Andrew,
>>>>>>
>>>>>> This contains minor changes, mostly documentation and changelog
>>>>>> updates, off-case build fix, and a code optimization in
>>>>>> res_counter_common_ancestor().
>>>>>
>>>>> I'd normally duck a patch series like this when we're at -rc8 and ask
>>>>> for it to be resent late in -rc1. But I was feeling frisky so I
>>>>> grabbed this lot for a bit of testing and will sit on it until -rc1.
>>>>>
>>>>> I'm still not convinced that the kernel has a burning need for a "task
>>>>> counter subsystem". Someone convince me that we should merge this!
>>>>
>>>> We have real (accidental) DoS situations which happen because we don't
>>>> have this. It usually takes the form of some library not re-joining
>>>> threads. We end up deploying a few apps linked against this library,
>>>> and suddenly we're in trouble on a machine. Except, this being
>>>> Google, we're in trouble on a lot of machines.
>>>
>>> This is a bit foggy. I think you mean that machines are experiencing
>>> accidental forkbombs?
>>>
>>>> There may be other ways to cobble this sort of safety together, but
>>>> they are less appealing for various reasons. cgroups are how we
>>>> control groups of related pids.
>>>>
>>
>> At the end of the day, all cgroups are just a group of tasks. So I don't really
>> get the need to have a cgroup to control the number of tasks in the system.
>>
>> Why don't we just allow every cgroup to have a limit on the number of
>> tasks it can hold?
>
> Not sure what you mean. You would prefer to have this as a core feature in
> cgroups rather than a subsystem?

Well, ideally, I think we should put some effort into trying to reduce the
number of different possible cgroups subsystems.

I do see how keeping a different cgroup here adds flexibility. However,
this flexibility very easily translates into performance losses. The reason
is that when more than one cgroup needs to control and update some piece of
data, because we can't assume anything about the set of processes they have,
we have to walk hierarchies upwards multiple times - they are potentially
different. See for instance what happens with cpu vs cpuacct, which I am
trying to get rid of.

Because you are controlling tasks, and tasks are the main building block of
all cgroups, I think you should at least consider either using
a cgroup property, or bundling this into some other cgroup, like cpu - where
there is already some need, albeit minor, to keep track of the number of
processes in a group.
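A toy model of the cost Glauber describes above may help. This is not kernel code: the `Cgroup` class, the `charge()` helper and the path names are all made up for illustration. It only counts how many distinct nodes a single fork event touches when two subsystems live on separate hierarchies versus on one shared hierarchy.

```python
class Cgroup:
    """Toy cgroup node: a name, a parent pointer and a task counter."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.tasks = 0


def charge(cgroup):
    """Walk from `cgroup` up to its root, bumping each counter.

    Returns the names of the nodes visited, in walk order.
    """
    visited = []
    node = cgroup
    while node is not None:
        node.tasks += 1
        visited.append(node.name)
        node = node.parent
    return visited


# Case 1: two subsystems on separate hierarchies. The trees share no nodes,
# so one fork event means two independent walks.
cpu_leaf = Cgroup("cpu:/foo/bar", Cgroup("cpu:/foo", Cgroup("cpu:/")))
acct_leaf = Cgroup("cpuacct:/foo/bar", Cgroup("cpuacct:/foo", Cgroup("cpuacct:/")))
separate = charge(cpu_leaf) + charge(acct_leaf)

# Case 2: the same accounting done on one shared hierarchy: a single walk.
merged_leaf = Cgroup("/foo/bar", Cgroup("/foo", Cgroup("/")))
merged = charge(merged_leaf)

print(len(separate), "nodes touched with separate hierarchies")
print(len(merged), "nodes touched with one hierarchy")
```

The model says nothing about real cache behaviour; it only shows that the number of distinct nodes dereferenced per event grows with the number of independent hierarchies.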
[parent not found: <4EB2C852.6020706-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <4EB2C852.6020706-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2011-11-03 17:02 ` Paul Menage
  [not found] ` <CALdu-PDY8zpXYM3V9KRk4f2NyGevfNnuaWVdoT-qzSHOK--K3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Menage @ 2011-11-03 17:02 UTC
To: Glauber Costa
Cc: Aditya Kali, Kay Sievers, Tim Hockin, Frederic Weisbecker, Containers, Johannes Weiner, LKML, Oleg Nesterov, Glauber Costa, Tejun Heo, Andrew Morton, Paul Turner

On Thu, Nov 3, 2011 at 9:58 AM, Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
>
> Because you are controlling tasks, and tasks are the main building block of
> all cgroups, I think you should at least consider either using
> a cgroup property,

I don't see how making it a core cgroup property would remove the need
to walk the hierarchy.

Paul
[parent not found: <CALdu-PDY8zpXYM3V9KRk4f2NyGevfNnuaWVdoT-qzSHOK--K3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <CALdu-PDY8zpXYM3V9KRk4f2NyGevfNnuaWVdoT-qzSHOK--K3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-03 17:06 ` Glauber Costa
  [not found] ` <4EB2CA03.7030601-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Glauber Costa @ 2011-11-03 17:06 UTC
To: Paul Menage
Cc: Aditya Kali, Kay Sievers, Tim Hockin, Frederic Weisbecker, Containers, Johannes Weiner, LKML, Oleg Nesterov, Glauber Costa, Tejun Heo, Andrew Morton, Paul Turner

On 11/03/2011 03:02 PM, Paul Menage wrote:
> On Thu, Nov 3, 2011 at 9:58 AM, Glauber Costa<glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
>>
>> Because you are controlling tasks, and tasks are the main building block of
>> all cgroups, I think you should at least consider either using
>> a cgroup property,
>
> I don't see how making it a core cgroup property would remove the need
> to walk the hierarchy.
>

Sorry if I wasn't clear: It removes the need to walk multiple independent
hierarchies. The walk is done only once.
[parent not found: <4EB2CA03.7030601-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <4EB2CA03.7030601-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2011-11-03 17:28 ` Paul Menage
  [not found] ` <CALdu-PA2CDoeUMoNd1y44p_QzphX8J4s6NDcSyVC-rP1HGYwkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Menage @ 2011-11-03 17:28 UTC
To: Glauber Costa
Cc: Aditya Kali, Kay Sievers, Tim Hockin, Frederic Weisbecker, Containers, Johannes Weiner, LKML, Oleg Nesterov, Glauber Costa, Tejun Heo, Andrew Morton, Paul Turner

On Thu, Nov 3, 2011 at 10:06 AM, Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
> Sorry if I wasn't clear: It removes the need to walk multiple independent
> hierarchies. The walk is done only once.

You're talking about at fork time, and the concern is the cache
footprint involved in walking up the parent pointer chain?

Isn't that an argument against multiple hierarchies (which is a
decision for the admin), rather than against more subsystem
flexibility? If multiple subsystems on the same hierarchy each need to
walk up the pointer chain on the same event, then after the first
subsystem has done so the chain will be in cache for any subsequent
walks from other subsystems.

Paul
[parent not found: <CALdu-PA2CDoeUMoNd1y44p_QzphX8J4s6NDcSyVC-rP1HGYwkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <CALdu-PA2CDoeUMoNd1y44p_QzphX8J4s6NDcSyVC-rP1HGYwkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-03 17:35 ` Glauber Costa
  [not found] ` <4EB2D0F2.40309-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Glauber Costa @ 2011-11-03 17:35 UTC
To: Paul Menage
Cc: Aditya Kali, Kay Sievers, Tim Hockin, Frederic Weisbecker, Containers, Johannes Weiner, LKML, Oleg Nesterov, Glauber Costa, Tejun Heo, Andrew Morton, Paul Turner

On 11/03/2011 03:28 PM, Paul Menage wrote:
> On Thu, Nov 3, 2011 at 10:06 AM, Glauber Costa<glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
>> Sorry if I wasn't clear: It removes the need to walk multiple independent
>> hierarchies. The walk is done only once.
>
> You're talking about at fork time, and the concern is the cache
> footprint involved in walking up the parent pointer chain?

Yes, we can say this is my main concern.

> Isn't that an argument against multiple hierarchies (which is a
> decision for the admin), rather than against more subsystem
> flexibility?

It is not always a decision for the admin. In most cases, it is a
constraint of the problem. For containers (take lxc as an example),
the most reasonable thing to do is to grab all cgroups subsystems
available, and contain them.

> If multiple subsystems on the same hierarchy each need to
> walk up the pointer chain on the same event, then after the first
> subsystem has done so the chain will be in cache for any subsequent
> walks from other subsystems.

No, it won't. Precisely because different subsystems have completely
independent pointer chains.
[parent not found: <4EB2D0F2.40309-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <4EB2D0F2.40309-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2011-11-03 17:56 ` Paul Menage
  [not found] ` <CALdu-PDbJ69FayXSd-kjAMX8AKEroZytPapxsUn8GFsz-z1omQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Menage @ 2011-11-03 17:56 UTC
To: Glauber Costa
Cc: Aditya Kali, Kay Sievers, Tim Hockin, Frederic Weisbecker, Containers, Johannes Weiner, LKML, Oleg Nesterov, Glauber Costa, Tejun Heo, Andrew Morton, Paul Turner

On Thu, Nov 3, 2011 at 10:35 AM, Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
>
>> If multiple subsystems on the same hierarchy each need to
>> walk up the pointer chain on the same event, then after the first
>> subsystem has done so the chain will be in cache for any subsequent
>> walks from other subsystems.
>
> No, it won't. Precisely because different subsystems have completely
> independent pointer chains.

Because they're following res_counter parent pointers, etc, rather
than using the single cgroups parent pointer chain?

So if that's the problem, rather than artificially constrain
flexibility in order to improve micro-benchmarks, why not come up with
approaches that keep both the flexibility and the performance?

- make res_counter hierarchies be explicitly defined via the cgroup
parent pointers, rather than a parent pointer hidden inside the
res_counter. So the cgroup parent chain traversal would all be along
the common parent pointers (and res_counter would be one pointer
smaller).

- allow subsystems to specify that they need a small amount of data
that can be accessed efficiently up the cgroup chain. (Many subsystems
wouldn't need this, and those that do would likely only need it for a
subset of their per-cgroup data). Pack this data into as few
cachelines as possible, allocated as a single lump of memory per
cgroup. Each subsystem would know where in that allocation its private
data lay (it would be the same offset for every cgroup, although
dynamically determined at runtime based on the number of subsystems
mounted on that hierarchy).

Paul
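Paul's second suggestion above can be sketched roughly as follows. This is a minimal illustration, not a proposal for actual kernel data structures: `assign_offsets()`, the `hot` array and `charge_all()` are hypothetical names, and a Python `array` merely stands in for "a single lump of memory per cgroup".

```python
from array import array


def assign_offsets(subsystems):
    """Assign each subsystem a fixed slot offset in the packed per-cgroup data.

    `subsystems` is a list of (name, n_words) pairs for the subsystems
    mounted on one hierarchy. The offsets are computed once, at "mount"
    time, and are the same for every cgroup in that hierarchy.
    """
    offsets = {}
    cursor = 0
    for name, n_words in subsystems:
        offsets[name] = cursor
        cursor += n_words
    return offsets, cursor


class Cgroup:
    """Toy cgroup: one shared parent pointer plus one contiguous data lump."""

    def __init__(self, total_words, parent=None):
        self.parent = parent
        self.hot = array("q", [0] * total_words)


def charge_all(cgroup, offsets, deltas):
    """Walk the common parent chain once, updating every subsystem's slot.

    Returns the number of cgroups visited.
    """
    steps = 0
    node = cgroup
    while node is not None:
        for name, delta in deltas.items():
            node.hot[offsets[name]] += delta
        node = node.parent
        steps += 1
    return steps


offsets, total = assign_offsets([("cpu", 1), ("tasks", 1)])
root = Cgroup(total)
leaf = Cgroup(total, Cgroup(total, root))
steps = charge_all(leaf, offsets, {"cpu": 1, "tasks": 1})
print(offsets, total, steps)
```

The point of the sketch is only that one walk along shared parent pointers can serve several subsystems when each knows its offset into the same per-cgroup allocation.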
[parent not found: <CALdu-PDbJ69FayXSd-kjAMX8AKEroZytPapxsUn8GFsz-z1omQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <CALdu-PDbJ69FayXSd-kjAMX8AKEroZytPapxsUn8GFsz-z1omQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-11-04 13:17 ` Glauber Costa
  0 siblings, 0 replies; 17+ messages in thread
From: Glauber Costa @ 2011-11-04 13:17 UTC
To: Paul Menage
Cc: Aditya Kali, Kay Sievers, Tim Hockin, Frederic Weisbecker, Containers, Johannes Weiner, LKML, Oleg Nesterov, cgroups-u79uwXL29TY76Z2rM5mHXA, Glauber Costa, Tejun Heo, Andrew Morton, Paul Turner

On 11/03/2011 03:56 PM, Paul Menage wrote:
> On Thu, Nov 3, 2011 at 10:35 AM, Glauber Costa<glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
>>
>>> If multiple subsystems on the same hierarchy each need to
>>> walk up the pointer chain on the same event, then after the first
>>> subsystem has done so the chain will be in cache for any subsequent
>>> walks from other subsystems.
>>
>> No, it won't. Precisely because different subsystems have completely
>> independent pointer chains.
>
> Because they're following res_counter parent pointers, etc, rather
> than using the single cgroups parent pointer chain?

No. Because:

/sys/fs/cgroup/my_subsys/
/sys/fs/cgroup/my_subsys/foo1
/sys/fs/cgroup/my_subsys/foo2
/sys/fs/cgroup/my_subsys/foo1/bar1

and:

/sys/fs/cgroup/my_subsys2/
/sys/fs/cgroup/my_subsys2/foo1
/sys/fs/cgroup/my_subsys2/foo1/bar1
/sys/fs/cgroup/my_subsys2/foo1/bar2

are completely independent pointer chains. The only thing they share is
the pointer to the root. And that's irrelevant in the pointer dance.

Also note that I used cpu and cpuacct as an example, and they don't use
res_counters.

> So if that's the problem, rather than artificially constrain
> flexibility in order to improve micro-benchmarks, why not come up with
> approaches that keep both the flexibility and the performance?

Well, I am not opposed to that even if you happen to agree on what I
said above. But at the end of the day, with many cgroups appearing, it
may not be about just micro benchmarks. It is hard to draw the line, but
I believe that avoiding creating new cgroups subsystems when possible
plays in our favor.

Specifically for this one, my arguments are:

* cgroups are a task-grouping entity
* therefore, all cgroups already do some task manipulation in attach/detach
* all cgroups subsystems can already register a fork handler

Adding a fork limit as a cgroup property seems a logical step to me
based on that.

If, however, we are really creating this, I think we'd be better off
referring to this as a "Task Controller" rather than a "Task Counter".
Then at least in the near future when people start trying to limit
other task-related resources, this can serve as a natural placeholder
for this. (See the syscall limiting that Lukasz is trying to achieve)

>
> - make res_counter hierarchies be explicitly defined via the cgroup
> parent pointers, rather than a parent pointer hidden inside the
> res_counter. So the cgroup parent chain traversal would all be along
> the common parent pointers (and res_counter would be one pointer
> smaller).
>
>
> - allow subsystems to specify that they need a small amount of data
> that can be accessed efficiently up the cgroup chain. (Many subsystems
> wouldn't need this, and those that do would likely only need it for a
> subset of their per-cgroup data). Pack this data into as few
> cachelines as possible, allocated as a single lump of memory per
> cgroup. Each subsystem would know where in that allocation its private
> data lay (it would be the same offset for every cgroup, although
> dynamically determined at runtime based on the number of subsystems
> mounted on that hierarchy)

I thought about this second one myself. I am not yet convinced this
would be a win, but I believe there are chances.
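The "fork limit as a cgroup property" idea discussed above amounts to a hierarchical charge at fork time: every ancestor is charged, and the fork is refused if any of them is at its limit. The sketch below is a toy model under that assumption; the `limit` and `usage` fields and `try_charge_fork()` are illustrative names, not the patchset's actual interface.

```python
class Cgroup:
    """Toy cgroup with an optional task limit (None means unlimited)."""

    def __init__(self, name, parent=None, limit=None):
        self.name = name
        self.parent = parent
        self.limit = limit
        self.usage = 0


def try_charge_fork(cgroup):
    """Charge one new task to `cgroup` and to every ancestor.

    Returns None on success. If some level is already at its limit,
    the partial charges are rolled back and that level's name is returned.
    """
    charged = []
    node = cgroup
    while node is not None:
        if node.limit is not None and node.usage + 1 > node.limit:
            for done in charged:  # undo what was charged so far
                done.usage -= 1
            return node.name
        node.usage += 1
        charged.append(node)
        node = node.parent
    return None


root = Cgroup("/", limit=3)
foo = Cgroup("/foo", root, limit=2)
bar = Cgroup("/foo/bar", foo)

# Three forks in /foo/bar: the third one hits the limit set on /foo.
results = [try_charge_fork(bar) for _ in range(3)]
print(results, bar.usage, foo.usage, root.usage)
```

Whether this lives in a separate subsystem or in the cgroup core, the walk itself looks the same; the disagreement in the thread is about how many such walks one fork ends up doing.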
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <20111028163021.1ce61f8a.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
  2011-10-29 9:38 ` Glauber Costa
@ 2011-11-03 17:00 ` Frederic Weisbecker
  [not found] ` <20111103170038.GG8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Frederic Weisbecker @ 2011-11-03 17:00 UTC
To: Andrew Morton, Tim Hockin
Cc: Aditya Kali, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Containers

On Fri, Oct 28, 2011 at 04:30:21PM -0700, Andrew Morton wrote:
> On Tue, 25 Oct 2011 13:06:35 -0700
> Tim Hockin <thockin-Rl2oBbRerpQdnm+yROfE0A@public.gmane.org> wrote:
>
> > On Tue, Oct 4, 2011 at 3:01 PM, Andrew Morton <akpm00-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > > On Mon, 3 Oct 2011 21:07:02 +0200
> > > Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > >
> > >> Hi Andrew,
> > >>
> > >> This contains minor changes, mostly documentation and changelog
> > >> updates, off-case build fix, and a code optimization in
> > >> res_counter_common_ancestor().
> > >
> > > I'd normally duck a patch series like this when we're at -rc8 and ask
> > > for it to be resent late in -rc1. But I was feeling frisky so I
> > > grabbed this lot for a bit of testing and will sit on it until -rc1.
> > >
> > > I'm still not convinced that the kernel has a burning need for a "task
> > > counter subsystem". Someone convince me that we should merge this!
> >
> > We have real (accidental) DoS situations which happen because we don't
> > have this. It usually takes the form of some library not re-joining
> > threads. We end up deploying a few apps linked against this library,
> > and suddenly we're in trouble on a machine. Except, this being
> > Google, we're in trouble on a lot of machines.
>
> This is a bit foggy. I think you mean that machines are experiencing
> accidental forkbombs?

I'd like to hear more details as well.

>
> > There may be other ways to cobble this sort of safety together, but
> > they are less appealing for various reasons. cgroups are how we
> > control groups of related pids.
> >
> > I'd really love to be able to use this.
>
> Has it been confirmed that this implementation actually solves the
> problem? ie: tested a bit?
>
> btw, Frederic told me that this version of the patchset had some
> serious problem so it's on hold pending an upgrade, regardless of other
> matters.

Yep. The particular issue is https://lkml.org/lkml/2011/10/13/532

Li Zefan proposed a fix (https://lkml.org/lkml/2011/10/17/26) which I'm
currently reworking.

But then I'd love it if you can test this subsystem to see if it really matches
your needs, Tim.

Thanks!
[parent not found: <20111103170038.GG8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <20111103170038.GG8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
@ 2011-11-04 2:57 ` Li Zefan
  [not found] ` <4EB3549D.5090404-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Li Zefan @ 2011-11-04 2:57 UTC
To: Frederic Weisbecker
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Andrew Morton, Containers

>>> There may be other ways to cobble this sort of safety together, but
>>> they are less appealing for various reasons. cgroups are how we
>>> control groups of related pids.
>>>
>>> I'd really love to be able to use this.
>>
>> Has it been confirmed that this implementation actually solves the
>> problem? ie: tested a bit?
>>
>> btw, Frederic told me that this version of the patchset had some
>> serious problem so it's on hold pending an upgrade, regardless of other
>> matters.
>
> Yep. The particular issue is https://lkml.org/lkml/2011/10/13/532
>
> Li Zefan proposed a fix (https://lkml.org/lkml/2011/10/17/26) which I'm
> currently reworking.
>

We really need to coordinate cgroup patches. I mean, the patchset+fix conflict
with Tejun's work, and the conflict is not trivial.

> But then I'd love it if you can test this subsystem to see if it really matches
> your needs, Tim.
>
[parent not found: <4EB3549D.5090404-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <4EB3549D.5090404-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
@ 2011-11-04 12:37 ` Frederic Weisbecker
  0 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2011-11-04 12:37 UTC
To: Li Zefan
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Tejun Heo, Andrew Morton, Containers

On Fri, Nov 04, 2011 at 10:57:33AM +0800, Li Zefan wrote:
> >>> There may be other ways to cobble this sort of safety together, but
> >>> they are less appealing for various reasons. cgroups are how we
> >>> control groups of related pids.
> >>>
> >>> I'd really love to be able to use this.
> >>
> >> Has it been confirmed that this implementation actually solves the
> >> problem? ie: tested a bit?
> >>
> >> btw, Frederic told me that this version of the patchset had some
> >> serious problem so it's on hold pending an upgrade, regardless of other
> >> matters.
> >
> > Yep. The particular issue is https://lkml.org/lkml/2011/10/13/532
> >
> > Li Zefan proposed a fix (https://lkml.org/lkml/2011/10/17/26) which I'm
> > currently reworking.
> >
>
> We really need to coordinate cgroup patches. I mean, the patchset+fix conflict
> with Tejun's work, and the conflict is not trivial.

Either Tejun targets -mm, or I try to get my patches into the pm tree
where Tejun's patches are aimed. I would just like to keep Andrew involved
in the process for my patches somehow.

Also it might be time for you and/or Paul Menage to run a cgroup git tree,
what do you think? :)
[parent not found: <1317668832-10784-1-git-send-email-fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <1317668832-10784-1-git-send-email-fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2011-12-13 15:58 ` Tejun Heo
  [not found] ` <20111213155848.GI25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tejun Heo @ 2011-12-13 15:58 UTC
To: Frederic Weisbecker
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Andrew Morton, Containers

Hello, Frederic.

Can you please rebase the patchset on top of cgroup/for-3.3?

I primarily like the idea of being able to track process usage w/ cgroup
and enforce limits on it but hope that it could somehow integrate w/
cgroup freezer. ie. trigger freezer if it goes over limit and let the
userland tool / administrator deal with the frozen cgroup. I'm
planning on extending cgroup freezer such that it supports recursive
freezing and killing of frozen tasks. If we can fit task counters
into that, we'll have a general method of handling problematic cgroups -
freeze, notify userland and let it deal with it.

Thank you.

--
tejun
[parent not found: <20111213155848.GI25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
  [not found] ` <20111213155848.GI25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2011-12-13 19:06 ` Frederic Weisbecker
  [not found] ` <20111213190642.GB2421-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Frederic Weisbecker @ 2011-12-13 19:06 UTC
To: Tejun Heo
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Andrew Morton, Containers

On Tue, Dec 13, 2011 at 07:58:48AM -0800, Tejun Heo wrote:
> Hello, Frederic.
>
> Can you please rebase the patchset on top of cgroup/for-3.3?

Sure. But please note its fate is still under discussion. Whether
we want it upstream is still a running debate. But I certainly
need to rebase against your tree.

> I primarily like the idea of being able to track process usage w/ cgroup
> and enforce limits on it but hope that it could somehow integrate w/
> cgroup freezer. ie. trigger freezer if it goes over limit and let the
> userland tool / administrator deal with the frozen cgroup. I'm
> planning on extending cgroup freezer such that it supports recursive
> freezing and killing of frozen tasks. If we can fit task counters
> into that, we'll have a general method of handling problematic cgroups -
> freeze, notify userland and let it deal with it.

Hmm, so you suggest a kernel trigger that freezes the cgroup when the
task limit is reached?

What about instead implementing register_event() for tasks.usage, so
that the user can be notified through an eventfd when the limit is reached?
Then it would be up to the user to decide whether to freeze or do anything
else. Sounds like a more generic solution.

Hm?

Thanks.
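For readers unfamiliar with the eventfd mechanism Frederic mentions, here is a rough userland sketch. The cgroup v1 `cgroup.event_control` file and its "<event_fd> <control_fd> <args>" line format are an existing interface (used, for instance, by memory thresholds). Everything specific to task counting is an assumption: whether `tasks.usage` would accept such a registration, and what its argument would mean, is exactly what is being proposed in this thread, not something that exists. `os.eventfd` needs Python 3.10+ on Linux.

```python
import os


def event_control_line(event_fd, control_fd, args=""):
    """Format the registration line written to cgroup.event_control (cgroup v1)."""
    return f"{event_fd} {control_fd} {args}".rstrip()


def wait_for_task_limit(cgroup_dir, threshold):
    """Block until the kernel signals the eventfd once.

    Hypothetical: assumes tasks.usage implements register_event() and
    takes a numeric threshold as its argument. Nothing here has been
    tested against the patchset.
    """
    efd = os.eventfd(0)
    cfd = os.open(os.path.join(cgroup_dir, "tasks.usage"), os.O_RDONLY)
    try:
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
            ctl.write(event_control_line(efd, cfd, str(threshold)))
        # Returns the eventfd counter value once the event has fired.
        return os.eventfd_read(efd)
    finally:
        os.close(cfd)
        os.close(efd)


# Only the formatting helper is exercised here; the wait itself needs a
# kernel that provides the hypothetical tasks.usage event.
print(event_control_line(5, 6, "100"))
```

After the wakeup, a userland manager could then decide to freeze the group (for example through the freezer cgroup) or take any other action, which is the flexibility being argued for.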
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
From: Tejun Heo @ 2011-12-13 20:49 UTC
To: Frederic Weisbecker
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Andrew Morton, Containers

Hello,

On Tue, Dec 13, 2011 at 08:06:46PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 13, 2011 at 07:58:48AM -0800, Tejun Heo wrote:
> > Can you please rebase the patchset on top of cgroup/for-3.3?
>
> Sure. But please note its fate is still under discussion. Whether
> we want it upstream is still a running debate. But I certainly
> need to rebase against your tree.

I see.

> > I primarily like the idea of being able to track process usage w/ cgroup
> > and enforce limits on it but hope that it could somehow integrate w/
> > cgroup freezer. ie. trigger freezer if it goes over limit and let the
> > userland tool / administrator deal with the frozen cgroup. I'm
> > planning on extending cgroup freezer such that it supports recursive
> > freezing and killing of frozen tasks. If we can fit task counters
> > into that, we'll have general method of handling problematic cgroups -
> > freeze, notify userland and let it deal with it.
>
> Hmm, so you suggest a kernel trigger that freeze the cgroup when the
> task limit is reached?

Yeah, something like that. I'm not really sure about how it would
actually work tho.

> What about rather implementing register_event() for the tasks.usage such
> that the user can be notified using eventfd when the limit is reached.
> Then it would be up to the user to decide to freeze or any other thing.
> Sounds like a more generic solution.

Maybe, the problem would be how to ensure that the userland manager
can respond fast enough (whatever that means...).

Thanks.

--
tejun
* Re: [PATCH 00/10] cgroups: Task counter subsystem v6
From: Frederic Weisbecker @ 2011-12-14 15:07 UTC
To: Tejun Heo
Cc: Aditya Kali, Tim Hockin, Paul Menage, Kay Sievers, LKML, Oleg Nesterov, Johannes Weiner, Andrew Morton, Containers

On Tue, Dec 13, 2011 at 12:49:18PM -0800, Tejun Heo wrote:
> Hello,
>
> On Tue, Dec 13, 2011 at 08:06:46PM +0100, Frederic Weisbecker wrote:
> > On Tue, Dec 13, 2011 at 07:58:48AM -0800, Tejun Heo wrote:
> > > Can you please rebase the patchset on top of cgroup/for-3.3?
> >
> > Sure. But please note its fate is still under discussion. Whether
> > we want it upstream is still a running debate. But I certainly
> > need to rebase against your tree.
>
> I see.
>
> > > I primarily like the idea of being able to track process usage w/ cgroup
> > > and enforce limits on it but hope that it could somehow integrate w/
> > > cgroup freezer. ie. trigger freezer if it goes over limit and let the
> > > userland tool / administrator deal with the frozen cgroup. I'm
> > > planning on extending cgroup freezer such that it supports recursive
> > > freezing and killing of frozen tasks. If we can fit task counters
> > > into that, we'll have general method of handling problematic cgroups -
> > > freeze, notify userland and let it deal with it.
> >
> > Hmm, so you suggest a kernel trigger that freeze the cgroup when the
> > task limit is reached?
>
> Yeah, something like that. I'm not really sure about how it would
> actually work tho.
>
> > What about rather implementing register_event() for the tasks.usage such
> > that the user can be notified using eventfd when the limit is reached.
> > Then it would be up to the user to decide to freeze or any other thing.
> > Sounds like a more generic solution.
>
> Maybe, the problem would be how to ensure that the userland manager
> can respond fast enough (whatever that means...).

Yeah that's part of the goal of the task counter: limit the spreading
of the forkbomb soon enough such that the machine stays responsive
and the admin can react accordingly.
end of thread, other threads:[~2011-12-14 15:07 UTC | newest]
[not found] <1317668832-10784-1-git-send-email-fweisbec@gmail.com>
[not found] ` <20111004150111.e9337268.akpm00@gmail.com>
[not found] ` <CAAAKZwu67VMiZgdpp=i5p7zyGbOHGHXwF_iprufGPzTLkkUF2A@mail.gmail.com>
[not found] ` <CAAAKZwu67VMiZgdpp=i5p7zyGbOHGHXwF_iprufGPzTLkkUF2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-10-28 23:30 ` [PATCH 00/10] cgroups: Task counter subsystem v6 Andrew Morton
[not found] ` <20111028163021.1ce61f8a.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2011-10-29 9:38 ` Glauber Costa
[not found] ` <CAA6-i6o0SPfZJDx4SRR1hY-He0L6zHuv0saH6EaE7Mrc2HF6PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-03 16:49 ` Frederic Weisbecker
[not found] ` <20111103164917.GF8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-11-03 16:58 ` Glauber Costa
[not found] ` <4EB2C852.6020706-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-11-03 17:02 ` Paul Menage
[not found] ` <CALdu-PDY8zpXYM3V9KRk4f2NyGevfNnuaWVdoT-qzSHOK--K3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-03 17:06 ` Glauber Costa
[not found] ` <4EB2CA03.7030601-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-11-03 17:28 ` Paul Menage
[not found] ` <CALdu-PA2CDoeUMoNd1y44p_QzphX8J4s6NDcSyVC-rP1HGYwkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-03 17:35 ` Glauber Costa
[not found] ` <4EB2D0F2.40309-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-11-03 17:56 ` Paul Menage
[not found] ` <CALdu-PDbJ69FayXSd-kjAMX8AKEroZytPapxsUn8GFsz-z1omQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-04 13:17 ` Glauber Costa
2011-11-03 17:00 ` Frederic Weisbecker
[not found] ` <20111103170038.GG8198-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-11-04 2:57 ` Li Zefan
[not found] ` <4EB3549D.5090404-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2011-11-04 12:37 ` Frederic Weisbecker
[not found] ` <1317668832-10784-1-git-send-email-fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2011-12-13 15:58 ` Tejun Heo
[not found] ` <20111213155848.GI25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-13 19:06 ` Frederic Weisbecker
[not found] ` <20111213190642.GB2421-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-12-13 20:49 ` Tejun Heo
[not found] ` <20111213204918.GK25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-14 15:07 ` Frederic Weisbecker