* boot cgroup questions
@ 2008-03-12 1:23 Max Krasnyansky
  2008-03-12 1:27 ` Paul Menage
  0 siblings, 1 reply; 42+ messages in thread
From: Max Krasnyansky @ 2008-03-12 1:23 UTC
To: Paul Jackson, Paul Menage, Ingo Molnar, Peter Zijlstra; +Cc: LKML

Folks,

The concept of a 'boot' cgroup was discussed as part of the cpuset/cpuisol lkml threads. In short, the 'boot' group is very much like the 'root' or toplevel group, i.e. it contains all tasks, and the 'boot' cpuset contains all cpus, mem nodes, irqs, etc. The difference is that it can be easily shrunk if needed, whereas the toplevel/root group cannot.

I just wanted to make sure that we still want to create the 'boot' cgroup during kernel init instead of doing it in user-space.

After looking into this a little bit I'm thinking of creating the 'boot' cgroup right after cpuset_init_smp() (init/main.c:841), just before do_basic_setup(), which creates work queues and stuff. The thing is though that the very next thing we do there is run early userspace. Which begs the question, shouldn't we just do it from early user-space then ? It'd be very simple to mount cgroup, create the 'boot' group and move all the tasks in there.

So kernel or early-userspace ?

If kernel: Paul M, do you have a suggestion as to what's the best way of creating a cgroup without mounting the cgroup fs ? Seems like there is currently no easy way of doing that. I probably missed it.

Thanx
Max

* Re: boot cgroup questions
From: Paul Menage @ 2008-03-12 1:27 UTC
To: Max Krasnyansky; +Cc: Paul Jackson, Ingo Molnar, Peter Zijlstra, LKML

On Tue, Mar 11, 2008 at 6:23 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
> The thing is though that the very next thing we do there is run early
> userspace. Which begs the question, shouldn't we just do it from early
> user-space then ?

Seems simplest to me. We have an early boot script that creates a "system" cpuset and moves all tasks into it. It seems to work fine for us.

Paul
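
The early-userspace approach described above could look roughly like the following. This is only a sketch, not the script Paul refers to: it assumes the legacy cpuset filesystem is already mounted at /dev/cpuset with control files named cpus, mems and tasks, and the function name create_system_cpuset is made up for illustration.

```python
import os


def create_system_cpuset(root="/dev/cpuset", name="system"):
    """Create a child cpuset under `root` and move every task listed in
    the root 'tasks' file into it. Returns the pids that could not be moved.

    Assumption: `root` is a mounted legacy cpuset filesystem whose control
    files are called 'cpus', 'mems' and 'tasks'.
    """
    child = os.path.join(root, name)
    os.makedirs(child, exist_ok=True)

    # A freshly created cpuset has no cpus or mems, and tasks cannot be
    # attached until both are set, so copy the parent's values.
    for knob in ("cpus", "mems"):
        with open(os.path.join(root, knob)) as src:
            value = src.read().strip()
        with open(os.path.join(child, knob), "w") as dst:
            dst.write(value + "\n")

    with open(os.path.join(root, "tasks")) as f:
        pids = f.read().split()

    failed = []
    for pid in pids:
        try:
            # Write one pid per open/write/close cycle.
            with open(os.path.join(child, "tasks"), "a") as dst:
                dst.write(pid + "\n")
        except OSError:
            # Some tasks (pinned kernel threads, for example) may refuse
            # to move; record them and carry on.
            failed.append(pid)
    return failed
```

Because the root path is a parameter, the function can be tried against an ordinary directory containing fake cpus, mems and tasks files before pointing it at a real mount.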

* Re: boot cgroup questions
From: Max Krasnyansky @ 2008-03-12 2:34 UTC
To: Paul Menage; +Cc: Paul Jackson, Ingo Molnar, Peter Zijlstra, LKML

Paul Menage wrote:
> On Tue, Mar 11, 2008 at 6:23 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
>> The thing is though that the very next thing we do there is run early
>> userspace. Which begs the question, shouldn't we just do it from early
>> user-space then ?
>
> Seems simplest to me. We have an early boot script that creates a
> "system" cpuset and moves all tasks into it. It seems to work fine for
> us.

Suppose we were to do it from kernel. What's the right way to create a cgroup without mounting a cgroupfs ?

I just want to play with it. There are a couple of advantages that I see for doing it from kernel. We can move 'kthreadd' and idle threads into the 'boot' cgroup early on and therefore later on won't even have to iterate through the tasks and stuff. Whereas user-space has to iterate through tasks and be smart about threads that are pinned and stuff. Not a big deal, but if the kernel code is simple enough maybe it makes sense.

So, any pointers ? How do I do a create_cgroup() without the fs mounted ?

Max

* Re: boot cgroup questions
From: Paul Menage @ 2008-03-12 2:36 UTC
To: Max Krasnyansky; +Cc: Paul Jackson, Ingo Molnar, Peter Zijlstra, LKML

On Tue, Mar 11, 2008 at 7:34 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
>
> Suppose we were to do it from kernel. What's the right way to create a cgroup
> without mounting a cgroupfs ?

There isn't really a way, but you could always kern_mount() a filesystem inside the kernel.

> I just want to play with it. There are a couple of advantages that I see for
> doing it from kernel. We can move 'kthreadd' and idle threads into the 'boot'
> cgroup early on and therefore later on won't even have to iterate through the
> tasks and stuff.

Would this be done based on some boot commandline option? I don't think you'd want to do it unconditionally.

Paul

* Re: boot cgroup questions
From: Max Krasnyansky @ 2008-03-12 2:53 UTC
To: Paul Menage; +Cc: Paul Jackson, Ingo Molnar, Peter Zijlstra, LKML

Paul Menage wrote:
> On Tue, Mar 11, 2008 at 7:34 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
>> Suppose we were to do it from kernel. What's the right way to create a cgroup
>> without mounting a cgroupfs ?
>
> There isn't really a way, but you could always kern_mount() a
> filesystem inside the kernel.

Aha, that's what I was missing. kern_mount(). Cool :).

>> I just want to play with it. There are a couple of advantages that I see for
>> doing it from kernel. We can move 'kthreadd' and idle threads into the 'boot'
>> cgroup early on and therefore later on won't even have to iterate through the
>> tasks and stuff.
>
> Would this be done based on some boot commandline option? I don't
> think you'd want to do it unconditionally.

Hmm, I believe the original discussion was about doing it unconditionally. Why not, I guess ? It probably won't even affect your existing scripts since they will be able to move tasks into another set just like they do now. The only thing I can think of is that if your scripts use sched_load_balance then they will now have to unset it in the 'boot' set as well. Otherwise, since the 'boot' set will be non-exclusive (cpus and mems), it should not really affect anything.

So what's your concern with an unconditional 'boot' cgroup/cpuset ?

Max

* Re: boot cgroup questions
From: Paul Menage @ 2008-03-12 3:09 UTC
To: Max Krasnyansky; +Cc: Paul Jackson, Ingo Molnar, Peter Zijlstra, LKML

On Tue, Mar 11, 2008 at 7:53 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
> It probably won't even affect your existing scripts since
> they will be able to move tasks into another set just like they do now.

My boot scripts look in /dev/cpuset/tasks to find processes to move into the system cpuset. So that would break them.

> they will now have to unset it in the 'boot' set as well.

That can break existing userspace, so I presume PaulJ isn't in favour of this change.

> Otherwise since the
> 'boot' set will be non-exclusive (cpus and mems) it should not really affect
> anything.

Apart from other cpusets that *are* mem_exclusive or cpu_exclusive.

> So what's your concern with unconditional 'boot' cgroup/cpuset ?

The exclusivity problem, as above.

Which subsystems are you going to include in this boot hierarchy? Userspace is going to have to be aware of the fact that there's a cpusets hierarchy which might have to be dismantled if it wants to set up something different.

Paul

* Re: boot cgroup questions
From: Max Krasnyansky @ 2008-03-12 3:39 UTC
To: Paul Menage; +Cc: Paul Jackson, Ingo Molnar, Peter Zijlstra, LKML

Paul Menage wrote:
> On Tue, Mar 11, 2008 at 7:53 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
>> It probably won't even affect your existing scripts since
>> they will be able to move tasks into another set just like they do now.
>
> My boot scripts look in /dev/cpuset/tasks to find processes to move
> into the system cpuset. So that would break them.

I see. I assumed you just iterate through /proc/[0-9]*

>> they will now have to unset it in the 'boot' set as well.
>
> That can break existing userspace, so I presume PaulJ isn't in favour
> of this change.

My impression was that he was ok with changing his stuff. But I may be completely wrong of course. I'm actually perfectly fine with making it conditional. Maybe something like bootcpuset=1 ?

>> Otherwise since the
>> 'boot' set will be non-exclusive (cpus and mems) it should not really affect
>> anything.
>
> Apart from other cpusets that *are* mem_exclusive or cpu_exclusive.

Hold on, if you move all the tasks ... Oh, never mind :). You mean that you won't be able to create any cpusets that must be exclusive unless you nuke the 'boot' set. Makes sense.

>> So what's your concern with unconditional 'boot' cgroup/cpuset ?
>
> The exclusivity problem, as above.

Yes, I agree. If this 'boot' set is unconditional, user-space tools will have to change. As I mentioned above, I totally do not mind if it is conditional. Any other opinions out there ?

>
> Which subsystems are you going to include in this boot hierarchy?
> Userspace is going to have to be aware of the fact that there's a
> cpusets hierarchy which might have to be dismantled if it wants to set
> up something different.

I was going to only include 'cpusets'. Does it make sense for anything else ?

Max
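
The exclusivity problem discussed above can be illustrated with a toy model. This is a simplified sketch of the sibling rule as described in the cpuset documentation (two sibling cpusets may not share CPUs if either of them is cpu_exclusive); it is not kernel code, and the function and data layout are invented for the example.

```python
def can_coexist(siblings, new_cpus, new_exclusive):
    """Return True if a new cpuset with `new_cpus` could be created next
    to `siblings` without violating the cpu_exclusive rule.

    siblings: dict mapping cpuset name -> (set of cpu numbers, cpu_exclusive flag)
    """
    for cpus, exclusive in siblings.values():
        # Overlap is only a problem if at least one side is exclusive.
        if (exclusive or new_exclusive) and (cpus & new_cpus):
            return False
    return True


# A non-exclusive 'boot' set that spans every CPU on a 4-CPU box.
top_level = {"boot": ({0, 1, 2, 3}, False)}

# An exclusive sibling on CPUs 2-3 overlaps 'boot', so it is refused ...
blocked = can_coexist(top_level, {2, 3}, new_exclusive=True)

# ... until 'boot' is removed (or shrunk away from CPUs 2-3).
allowed = can_coexist({}, {2, 3}, new_exclusive=True)
```

In this model `blocked` is False and `allowed` is True, which is the "nuke the 'boot' set" situation Max describes.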

* Re: boot cgroup questions
From: Paul Jackson @ 2008-03-12 4:59 UTC
To: Paul Menage; +Cc: maxk, mingo, a.p.zijlstra, linux-kernel

Paul M wrote:
> > they will now have to unset it in the 'boot' set as well.
>
> That can break existing userspace, so I presume PaulJ isn't in favour
> of this change.

You're right - I don't favor it.

Using the 'cpus' in one or more cpusets to determine both:
 1) which CPUs can receive an irq, and
 2) resolving conflicts in such irq placement,
excessively overloads the cpuset hierarchy, breaking existing userspace, as Paul M notes.

If you don't have any other cpuset hierarchy you need to use, and so don't really otherwise care what your cpuset hierarchy is, then I suppose this works just fine.

But if you also need to use the cpuset hierarchy to define nested subsets of CPUs and Memory Nodes, for the purposes of controlling which tasks can run where (the original and still primary motivation for cpusets) then one can only conveniently specify those trivial irq configurations that happen to exactly conform with that hierarchy (that exactly want to make use of some of the same sets of CPUs, and that don't depend on the hierarchy to resolve conflicts in overlapping irq directives).

Almost any non-trivial use of cpusets for both irq directivity and CPU and Memory placement would complicate both hierarchies, forcing unending confusion and breakage on the existing cpuset users.

Some examples:

Let's say I have three cpusets defining the CPU and Memory Node sets in which I want to place my tasks:

    /dev/cpuset/A
    /dev/cpuset/B
    /dev/cpuset/C

and I want a particular set of irqs to be directed to the CPUs in A and B, but not C. Well -- guess I can duplicate the irqs settings.

But don't tell me to use a 'boot' cpuset, as in:

    /dev/cpuset/boot/A
    /dev/cpuset/boot/B
    /dev/cpuset/C

to accomplish this, as that intrudes in the hierarchy, breaking user code.

If my irq isolation needs don't exactly partition along the 'cpus' settings in A, B and C, then not even duplication helps.

If the 'irqs' in /dev/cpuset/A/Z (where Z's cpus are a proper subset of A's) don't match the 'irqs' in /dev/cpuset/A, then I have further confusions resulting from conflicting irq directives.

(If your proposal handles all the above, without forcing changes on the cpuset hierarchy, then I misread it - in that case, sorry.)

Paul M has already proposed pulling apart the binding of CPUs and Memory Nodes, in the underlying cgroups, as he apparently has cases in which the legacy connection of those two into a single cpuset hierarchy is an undesired constraint on (complication of) the hierarchy.

That's more likely the direction in which we should be proceeding -- making these hierarchies independent, not entwining them.

This additional overloading of the current cpuset hierarchy might handle the simple case you need. But that's only because you don't have conflicting needs for the cpuset hierarchy.

Hopefully, Paul M will be able to view with some sense of humor that I am complaining that this proposal of yourself (and Peter Z's earlier patches) isn't general enough, even as I have complained of some other recent cgroup proposals of Paul M that their increased generality isn't sufficient to justify their subtle incompatibilities.

At a minimum, as in my proposal (http://lkml.org/lkml/2008/3/6/512) of last week, one needs some mechanism independent of the cpuset hierarchy to resolve conflicts in these irq directives. As you may recall, that proposal named each set of irqs, let each cpuset specify which named set of irqs applied to its CPUs, and encoded the precedence N of each named list of irqs in the filename '/dev/cpuset/irqs.N.name' of the file listing the irqs in that named set. Then one can specify irqs for each cpuset, and have some way to specify the precedence of these irq specifications, without overloading the cpuset hierarchy.

Even this minimum proposal might be insufficient, if one has needs to specify irq directives for sets of CPUs that are not otherwise present in the cpuset hierarchy. Observe that this proposal does not handle the next to the last example case above. I am not yet convinced that this deficiency is a show stopper. It might be.

The other direction considered, making this its own cgroup, -seemed- to fail as well, as someone, I forget whom, noted. Cgroups attach tasks to sets of things. We aren't trying to attach tasks to anything. We're trying to attach irqs to CPUs. We are trying now to treat irqs as 'pseudo-tasks', but that forces the irq hierarchy to be a subset of the CPU hierarchy, due to overloading the 'cpus' set. This is the problem noted above.

Paul M -- could we take a different tack here -- extend cgroups to map -either- tasks or irqs to the managed resources? Then irqs would be managed by a cgroup hierarchy that mapped irqs to a subsystem specific attribute of 'cpus' (resembling the cpuset 'cpus'). If the hierarchy one needed for irqs was a nice subset of one's cpuset hierarchy, one might even mount both cgroup subsystems on the same mount, so long as we could work out what it means for two cgroup subsystems to share the same subsystem specific attribute, 'cpus' in this case.

--
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
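
The 'irqs.N.name' naming scheme mentioned above can be pictured with a small sketch. Everything beyond the filename pattern is an assumption made for illustration: the helper names are invented, the irq sets are modelled as plain Python sets rather than files, and since the thread does not say whether a larger or smaller N takes precedence, the code only orders the competing sets and leaves the choice of winner to the caller.

```python
import re

# Filename pattern taken from the message: irqs.<precedence N>.<name>
_IRQ_SET_FILE = re.compile(r"^irqs\.(\d+)\.(.+)$")


def parse_irq_set_filename(filename):
    """Split 'irqs.N.name' into (N as int, name)."""
    match = _IRQ_SET_FILE.match(filename)
    if match is None:
        raise ValueError("not an irqs.N.name file: %r" % filename)
    return int(match.group(1)), match.group(2)


def sets_claiming(irq, irq_sets):
    """Return [(precedence, name), ...] for every named set that lists
    `irq`, sorted by ascending precedence number.

    irq_sets: dict mapping an 'irqs.N.name' filename -> set of irq numbers
    """
    claims = [
        parse_irq_set_filename(filename)
        for filename, irqs in irq_sets.items()
        if irq in irqs
    ]
    return sorted(claims)


example_sets = {
    "irqs.10.net": {16, 17},
    "irqs.2.disk": {17, 18},
}
claims_for_17 = sets_claiming(17, example_sets)
```

Here irq 17 appears in both named sets, so `claims_for_17` holds both entries in precedence order, and a policy outside the cpuset hierarchy picks one of them.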

* Re: boot cgroup questions
From: Max Krasnyanskiy @ 2008-03-12 18:24 UTC
To: Paul Jackson; +Cc: Paul Menage, mingo, a.p.zijlstra, linux-kernel

Paul Jackson wrote:
> Paul M wrote:
>>> they will now have to unset it in the 'boot' set as well.
>> That can break existing userspace, so I presume PaulJ isn't in favour
>> of this change.
>
> You're right - I don't favor it.

Hmm, I think we're mixing two different threads here.
1. How to map irq affinity handling onto cpusets.
2. Whether and how to create an in-kernel 'boot' cgroup/cpuset.

They are somewhat orthogonal imho, in the sense that no matter how we decide to handle irqs (even if we do not do them under cpusets at all) we may still want the 'boot' group. As I mentioned at the beginning of this thread, the 'boot' group/set is basically just a convenience feature. The only difference from the root/top group is that the 'boot' group can be dynamically resized and moved.

Ok. So the rest of the email is mostly about irqs. It'd be nice if it was in the other thread (cpuset: irq affinity support) but I'm ok with replying here.

> Using the 'cpus' in one or more cpusets to determine both:
> 1) which CPUs can receive an irq, and
> 2) resolving conflicts in such irq placement,
> excessively overloads the cpuset hierarchy, breaking existing
> userspace, as Paul M notes.
> If you don't have any other cpuset hierarchy you need to use, and
> so don't really otherwise care what your cpuset hierarchy is, then
> I suppose this works just fine.

I'm not sure #2 is a concern. With the latest cpuset irq handling patches conflict resolution is very simple: "an irq can belong to a single cpuset at a time".

> But if you also need to use the cpuset hierarchy to define nested
> subsets of CPUs and Memory Nodes, for the purposes of controlling
> which tasks can run where (the original and still primary motivation
> for cpusets) then one can only conveniently specify those trivial
> irq configurations that happen to exactly conform with that hierarchy
> (that exactly want to make use of some of the same sets of CPUs, and
> that don't depend on the hierarchy to resolve conflicts in overlapping
> irq directives).

I do not think we need overlapping irq directives.

> Almost any non-trivial use of cpusets for both irq directivity and CPU
> and Memory placement would complicate both hierarchies, forcing
> unending confusion and breakage on the existing cpuset users.

I'm not sure what breakage you're talking about. But let's talk examples I guess. See below.

> Some examples:
> Let's say I have three cpusets defining the CPU and Memory Node
> sets in which I want to place my tasks:
>
> /dev/cpuset/A
> /dev/cpuset/B
> /dev/cpuset/C
>
> and I want a particular set of irqs to be directed to the CPUs in A
> and B, but not C. Well -- guess I can duplicate the irqs settings.
>
> But don't tell me to use a 'boot' cpuset, as in:
>
> /dev/cpuset/boot/A
> /dev/cpuset/boot/B
> /dev/cpuset/C
>
> to accomplish this, as that intrudes in the hierarchy, breaking
> user code.
>
> If my irq isolation needs don't exactly partition along the
> 'cpus' settings in A, B and C, then not even duplication helps.
>
> If the 'irqs' in /dev/cpuset/A/Z (where Z's cpus are a proper
> subset of A's) don't match the 'irqs' in /dev/cpuset/A, then I
> have further confusions resulting from conflicting irq directives.

How is that any different from tasks ? Exact same example right back at you. Suppose I have a task that needs to run in A and B but not C. In fact, if you look at the example that I provided in the other thread, I already have such an app. In my current apps different threads have to run in different cpusets. And yes, I think the way to solve that is to use a more complex cpuset hierarchy like the one you used above. I would not necessarily mix in the 'boot' set here. I mean, if people want to subdivide it that's fine, but they do not have to. People can just nuke the 'boot' group/set and create something else.

> (If your proposal handles all the above, without forcing changes
> on the cpuset hierarchy, then I misread it - in that case, sorry.)

It does not force any changes. irqs are handled just like tasks, and if people have complex partitioning requirements they may have to use more complicated hierarchies.

> Paul M has already proposed pulling apart the binding of CPUs and
> Memory Nodes, in the underlying cgroups, as he apparently has cases in
> which the legacy connection of those two into a single cpuset hierarchy
> is an undesired constraint on (complication of) the hierarchy.
> That's more likely the direction in which we should be proceeding --
> making these hierarchies independent, not entwining them.
>
> This additional overloading of the current cpuset hierarchy might
> handle the simple case you need. But that's only because you don't
> have conflicting needs for the cpuset hierarchy.
>
> Hopefully, Paul M will be able to view with some sense of humor that I
> am complaining that this proposal of yourself (and Peter Z's earlier
> patches) isn't general enough, even as I have complained of some
> other recent cgroup proposals of Paul M that their increased
> generality isn't sufficient to justify their subtle incompatibilities.
>
> At a minimum, as in my proposal (http://lkml.org/lkml/2008/3/6/512) of
> last week, one needs some mechanism independent of the cpuset hierarchy
> to resolve conflicts in these irq directives. As you may recall,
> that proposal named each set of irqs, let each cpuset specify which
> named set of irqs applied to its CPUs, and encoded the precedence N
> of each named list of irqs in the filename '/dev/cpuset/irqs.N.name'
> of the file listing the irqs in that named set. Then one can specify
> irqs for each cpuset, and have some way to specify the precedence of
> these irq specifications, without overloading the cpuset hierarchy.
>
> Even this minimum proposal might be insufficient, if one has needs
> to specify irq directives for sets of CPUs that are not otherwise
> present in the cpuset hierarchy. Observe that this proposal does
> not handle the next to the last example case above. I am not yet
> convinced that this deficiency is a show stopper. It might be.

That (i.e. additional sets of irqs) seems like a major overkill to me. Probably because I do not think that there are any conflicts to resolve in the first place. As I explained above, if we treat irqs just like tasks (from the cpuset perspective) then the same exact rules and limitations apply. An irq can be assigned to a single cpuset at a time. Complex requirements can be solved either by deeper cgroup/cpuset hierarchies or, worst case, if there is some totally wacky constraint, people always have the option of assigning the irq to the top cpuset and using the /proc/irq/N/smp_affinity interface to select which cpus it can run on.

> The other direction considered, making this its own cgroup, -seemed-
> to fail as well, as someone, I forget whom, noted. Cgroups attach
> tasks to sets of things. We aren't trying to attach tasks to anything.
> We're trying to attach irqs to CPUs. We are trying now to treat irqs
> as 'pseudo-tasks', but that forces the irq hierarchy to be a subset
> of the CPU hierarchy, due to overloading the 'cpus' set. This is the
> problem noted above.
>
> Paul M -- could we take a different tack here -- extend cgroups to map
> -either- tasks or irqs to the managed resources? Then irqs would be
> managed by a cgroup hierarchy that mapped irqs to a subsystem specific
> attribute of 'cpus' (resembling the cpuset 'cpus'). If the hierarchy
> one needed for irqs was a nice subset of one's cpuset hierarchy, one
> might even mount both cgroup subsystems on the same mount, so long
> as we could work out what it means for two cgroup subsystems to share
> the same subsystem specific attribute, 'cpus' in this case.

Hold on. How does this help if at the end of the day 'cpus' are still shared between the irq and task groups ? We'd still have the exact same constraints.

btw I'm starting to gravitate back towards my original solution (i.e. a cpu_map that tells which cpus can be used by kernel and irqs). If you remember, I was totally against using cpusets/cgroups exactly because they are designed to handle tasks. You guys convinced me that we can extend them and that it's a better way to go about it. Yet after weeks of discussion we seem to be talking about adding more and more stuff.

Anyway, I think treating irqs as tasks and enforcing the same rules and constraints should handle most scenarios, even complex ones, at the expense of deeper cpuset hierarchies. Adding new 'cgroups' or 'sets' just for irqs seems overkill.

Max
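
For the /proc/irq/N/smp_affinity fallback mentioned above, the value written to that file is a hexadecimal CPU bitmask. The sketch below only builds the mask string; the function name is made up for illustration, and it assumes a single ungrouped hex value (as far as I know, wider masks are shown by the kernel in comma-separated 32-bit groups, which this sketch does not produce).

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex bitmask string with bit N set for every cpu N in `cpus`."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("cpu numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


# CPUs 0 and 1 -> bits 0 and 1 -> 0x3
mask_a = cpus_to_affinity_mask([0, 1])

# CPUs 2 and 3 -> bits 2 and 3 -> 0xc
mask_b = cpus_to_affinity_mask([2, 3])
```

On a real system the resulting string would be written to /proc/irq/N/smp_affinity for the irq in question, which requires root and an irq whose affinity can be changed.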

* Re: boot cgroup questions
From: Paul Jackson @ 2008-03-12 18:57 UTC
To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel

Max K wrote:
> How is that any different from tasks ? Exact same example right back at you.
> Suppose I have a task that needs to run in A and B but not C.

Can't happen.

Each task belongs to exactly one cpuset, no exceptions.

That's why you can't "treat irqs just like tasks".

--
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

* Re: boot cgroup questions
From: Max Krasnyanskiy @ 2008-03-12 19:11 UTC
To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel

Paul Jackson wrote:
> Max K wrote:
>> How is that any different from tasks ? Exact same example right back at you.
>> Suppose I have a task that needs to run in A and B but not C.
>
> Can't happen.

Of course it can. See below.

> Each task belongs to exactly one cpuset, no exceptions.

Sure. Same for irqs.

> That's why you can't "treat irqs just like tasks".

Sure you can.

I was talking about running on the _cpus_ that belong to the "sets A and B but not C", not saying that a task must belong to more than one cpuset. Unless I misinterpreted your example, you were talking about the exact same thing. In other words, that an irq needs to be assigned to the _cpus_ in the sets A and B but not C.

Makes sense ?

Max

* Re: boot cgroup questions
From: Paul Jackson @ 2008-03-12 19:32 UTC
To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel

Max wrote:
> I was talking about running on the _cpus_ that belong to the "sets A and B but
> not C" and not that a task must belong to more than one cpuset.

This doesn't make sense to me.

If a task is to run on the CPUs in both sets A and B, then it has to be in both those cpusets, which isn't allowed, or in some super set of both A and B (that is, in this example, in the top cpuset), which doesn't restrict the task to just A or B or their union.

I have no idea what distinction you are seeing between what _cpus_ a task can run on, and what cpuset it belongs to.

--
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

* Re: boot cgroup questions
From: Max Krasnyanskiy @ 2008-03-12 20:08 UTC
To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel

Paul Jackson wrote:
> Max wrote:
>> I was talking about running on the _cpus_ that belong to the "sets A and B but
>> not C" and not that a task must belong to more than one cpuset.
>
> This doesn't make sense to me.
>
> If a task is to run on the CPUs in both sets A and B, then it has to be
> in both those cpusets, which isn't allowed, or in some super set of both
> A and B (that is, in this example, in the top cpuset), which doesn't
> restrict the task to just A or B or their union.
>
> I have no idea what distinction you are seeing between what _cpus_ a task
> can run on, and what cpuset it belongs to.

Paul, we are in 100% agreement here about the tasks. All I'm saying is that the same exact thing applies to the irqs.

Again let me try your example. Suppose we have

    /dev/cpuset/A
    /dev/cpuset/B
    /dev/cpuset/C

Now suppose that for whatever reason I must run task1 on the cpus that belong to sets A and B but not C. The only way to do that with cpusets is

    /dev/cpuset/X
        |-- A
        `-- B
    /dev/cpuset/C

i.e. create parent cpuset X and assign task1 into cpuset X. Of course if A and B are not cpu_exclusive then X does not have to be their parent.

Makes sense so far ?

Now the same exact thing can be said about the irqs. If I need to assign irq1 to the cpus in sets A and B but not C I have to create set X that is the union of A and B, and assign irq1 to the set X.

This is what I meant by "deeper hierarchies" in the earlier emails.

Did I do a better job explaining this time :) ?

Max
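
The "X is the union of A and B" step above can be sketched in code. The helpers below parse and format CPU lists in the comma-and-range notation used by cpuset 'cpus' files (for example "0-3,8"); the function names and the example CPU numbers are assumptions made for illustration, and nothing here touches a real cpuset.

```python
def parse_cpulist(text):
    """Parse a list such as '0-3,8' into a set of cpu numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of cpu numbers back into compact range notation."""
    ranges = []
    start = prev = None
    for cpu in sorted(cpus):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(
        str(lo) if lo == hi else "%d-%d" % (lo, hi) for lo, hi in ranges
    )


def union_cpulist(*cpulists):
    """Return the cpu list covering every cpu named in any argument."""
    combined = set()
    for text in cpulists:
        combined |= parse_cpulist(text)
    return format_cpulist(combined)


# Hypothetical layout: A owns CPUs 0-1, B owns CPUs 4-5, C owns 2-3.
cpus_for_x = union_cpulist("0-1", "4-5")
```

With that layout, `cpus_for_x` is the string that would go into the 'cpus' file of the parent set X, leaving C's CPUs out.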

* Re: boot cgroup questions
From: Paul Jackson @ 2008-03-12 20:37 UTC
To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel

> This is what I meant by "deeper hierarchies" in the earlier emails.

These deeper hierarchies create an incompatibility in some common uses of cpusets.

When my example had cpusets A, B and C, that was as stated, not as might be modified to X, X/A, X/B and C.

If the user has or would have setup cpusets A, B and C because that's what they needed to manage the CPU and Memory Node placement of their tasks, then that's what they might have setup, and there is a good chance that they would find the imposition of the extra 'X' cpuset to be a problem, to require more code and to be a cause of bugs.

Adding irqs to the cpuset hierarchy isn't free; it can further overload the hierarchy, with "deeper hierarchies" as you state.

If instead of deeper hierarchies, we allow the same irq to be listed in more than one cpuset (unlike tasks, which only get one cpuset) then we need some way, independent of the cpuset hierarchy, to determine how to resolve conflicts. We can't just add all the cpus together, allowing an irq to be directed to any CPU which is listed in any cpuset that accepts that irq, because a major use for this is to remove irqs from certain realtime CPUs.

So ... if the natural hierarchy needed to map irqs to CPUs is not a subset of the natural hierarchy needed to map tasks to sets of CPUs and Nodes, then we either deepen the hierarchy (cross product of the two maps, essentially) or we allow the same irq to be listed in multiple cpusets and provide some alternative mechanism, outside the hierarchy, to resolve the resulting conflicts in the irq to CPU map.

--
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
* Re: boot cgroup questions 2008-03-12 20:37 ` Paul Jackson @ 2008-03-12 22:29 ` Max Krasnyanskiy 2008-03-12 23:30 ` Paul Jackson 2008-03-12 23:32 ` Paul Jackson 0 siblings, 2 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-03-12 22:29 UTC (permalink / raw) To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Paul Jackson wrote: >> This is what I meant by "deeper hierarchies" in the earlier emails. > > These deeper hierarchies create an incompatibility in some common uses > of cpusets. > > When my example had cpusets A, B and C, that was as stated, not as > might be modified to X, X/A, X/B and C. > > If the user has or would have setup cpusets A, B and C because that's > what they needed to manage the CPU and Memory Node placement of their > tasks, then that's what they might have setup, and there is a good > chance that they would find the imposition of the extra 'X' cpuset to > be a problem, to require more code and to be a cause of bugs. Isn't that just an issue of planing ? Those cpusets are not cast in stones are they. I mean yes users have setup A,B,C they way they did because that's what they needed. Now their plans/requirements have changed. They now want to also manage irqs via cpusets and in order to do that they need to replan/redo the partitioning. In order to manager irqs the code has to change anyway because currently there is not way to do that via cpuset. The users would have two options: 1. keep all irqs in the top set and manage them individually via /proc 2. layout cpusets differently btw I still do not see the "incompatibility" argument. Probably because I have no idea how the software you're talking about is designed. Are you saying that the software relies on a flat cpuset partitioning ? ie That it will brake if users add extra cpuset levels. > Adding irqs to the cpuset hierarchy isn't free; it can further overload > the hierarchy, with "deeper hierarchies" as you state. 
> > If instead of deeper hierarchies, we allow the same irq to be listed in > more than one cpuset (unlike tasks, which only get one cpuset) then we > need some way, independent of the cpuset hierarchy, to determine how to > resolve conflicts. We can't just add all the cpus together, allowing an > irq to be directed to any CPU which is listed in any cpuset that > accepts that irq, because a major use for this is to remove irqs from > certain realtime CPUs. This sounds like overkill and, as you pointed out, it is not even clear how it'd work. Looks like we have a trade-off here: 1. use the simple "irq == pseudo-task" concept and potentially break some existing software. We do have a working solution. 2. come up with something that requires more complex irq management rules, at the cost of extra complexity. We do not have a working solution. My vote goes for #1 :). > So ... if the natural hierarchy needed to map irqs to CPUs is not a > subset of the natural hierarchy needed to map tasks to sets of CPUs and > Nodes, then we either deepen the hierarchy (cross product of the > two maps, essentially) or we allow the same irq to be listed in > multiple cpusets and provide some alternative mechanism, outside the > hierarchy, to resolve the resulting conflicts in the irq to CPU map. I think by natural you mean "compatible with existing sw". What is unnatural in extra levels of cpusets ? If I read the cgroup/cpuset documentation it seems to imply that nested cgroups/cpusets are allowed. Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 22:29 ` Max Krasnyanskiy @ 2008-03-12 23:30 ` Paul Jackson 2008-03-13 0:57 ` Max Krasnyanskiy 2008-03-12 23:32 ` Paul Jackson 1 sibling, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-03-12 23:30 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Max K wrote: > btw I still do not see the "incompatibility" argument. It's similar, perhaps, to what happens when we try to accommodate two architectures in one file system, with things like: /x86_64/bin /ia64/bin replacing the well known /bin. Things break. Apps such as the major batch schedulers (PBS and LSF) and various other tools and scripts buried here and there have become used to developing particular cpuset hierarchies over the last couple of years. Any time you force another dimension into such an existing hierarchy, things break, and people get annoyed. Sure ... the kernel doesn't care ... it can handle whatever hierarchy you like. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 23:30 ` Paul Jackson @ 2008-03-13 0:57 ` Max Krasnyanskiy 2008-03-13 7:03 ` Paul Jackson 2008-03-13 7:12 ` Paul Jackson 0 siblings, 2 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-03-13 0:57 UTC (permalink / raw) To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Paul Jackson wrote: > Max K wrote: >> btw I still do not see the "incompatibility" argument. > > It's similar, perhaps, to what happens when we try to accomodate two > architectures in one file system, with things like: > /x86_64/bin > /ia64/bin > replacing the well known /bin. > > Things break. Apps such as the major batch schedulers (PBS and LSF) > and various other tools and scripts buried here and there have come > used to developing particular cpuset hierarchies over the last couple > of years. > > Any time you force another dimension into such an existing hierarchy, > things break, and people get annoyed. > > Sure ... the kernel doesn't care ... it can handle whatever hierarchy > you like. Crazy idea. How about we add support for sym links to the cgroup fs ? It's still much cleaner imo than dealing with complex irq grouping schemes. In other words with symlinks we could do `-- cpuset |-- A -> X/A |-- B -> X/B |-- C `-- X |-- A `-- B The software that is used to the flat structure won't know the difference. Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-13 0:57 ` Max Krasnyanskiy @ 2008-03-13 7:03 ` Paul Jackson 2008-04-10 18:03 ` Max Krasnyanskiy 2008-03-13 7:12 ` Paul Jackson 1 sibling, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-03-13 7:03 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Max K wrote: > cleaner imo than dealing with complex irq grouping schemes. What's this "complex irq grouping scheme" that you're referring to? If it's what I posted last week, with named sets of irqs, and each cpuset naming which set it belonged to, that seems to me to actually fit the usage pattern rather well. The jobs running in particular cpusets need only know the 'name' of the set of irqs it makes sense to send to its CPUs (the realtime irqs, a particular piece of hardwares irqs, the ordinary system irqs, the absolute minimum set of irqs, ...) and the system admin gets to specify, one time, which irq numbers are in which named set, or to change, later on, which set a particular irq is in, all without having to have detailed knowledge of the jobs that want particular irq sets directed to their CPUs. We tend to label whatever makes sense to us as "simple", and whatever doesn't seem necessary in our experience, or doesn't make sense, as "complex". Such labels are losing their meaning these days, other than to help others figure out what we favor, or disfavor. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-13 7:03 ` Paul Jackson @ 2008-04-10 18:03 ` Max Krasnyanskiy 2008-04-14 18:39 ` Paul Jackson 2008-04-14 18:42 ` boot cgroup questions Paul Jackson 0 siblings, 2 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-04-10 18:03 UTC (permalink / raw) To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel The context here was that we were talking about a way to group irqs and assign them to the cpusets. I was proposing to just treat IRQs as tasks, and you were proposing to add some additional grouping. Replies inline below. Paul Jackson wrote: > Max K wrote: >> cleaner imo than dealing with complex irq grouping schemes. > > What's this "complex irq grouping scheme" that you're referring to? > > If it's what I posted last week, with named sets of irqs, and each > cpuset naming which set it belonged to, that seems to me to actually > fit the usage pattern rather well. I was just saying that cpuset already provides a nice grouping. After thinking about this some more I still do not see a need to group IRQs before assigning them to the cpusets. That's the complexity I was talking about. > The jobs running in particular cpusets need only know the 'name' of > the set of irqs it makes sense to send to its CPUs (the realtime > irqs, a particular piece of hardwares irqs, the ordinary system > irqs, the absolute minimum set of irqs, ...) and the system admin > gets to specify, one time, which irq numbers are in which named > set, or to change, later on, which set a particular irq is in, all > without having to have detailed knowledge of the jobs that want > particular irq sets directed to their CPUs. > > We tend to label whatever makes sense to us as "simple", and whatever > doesn't seem necessary in our experience, or doesn't make sense, as > "complex". > > Such labels are losing their meaning these days, other than to help > others figure out what we favor, or disfavor. I agree in general. 
In this particular case additional grouping introduces even more hierarchy. It seems to me that "irqN -> cpu1, cpu2, cpu3" is a very simple, straightforward relationship. Whereas "irqN -> groupX" "groupX -> cpu1" "groupX -> cpu2" "groupX -> cpu3" is not that straightforward. Anyway. I think it all boils down to the compatibility with existing user-space apps. I still like the simple approach of treating irqs like tasks when it comes to assigning them to the cpusets. Which as we discussed earlier in some cases may require an extra level in the cpuset hierarchy. The question is, is that really such a big problem. If we make the in-kernel boot set optional, by default all irqs will be in the root cpuset. Which means people can still use /proc/irq/N/smp_affinity and manage irqs just like they do now. There are no compatibility issues in that case. So do you think app compatibility is an issue in that case ? Also isn't it likely that the apps will gradually adapt to handling multi-level cpusets anyway ? I mean you guys were talking about how wonderful and flexible cpusets are, but we cannot seem to use the flexibility because the apps are designed for a flat layout. Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-04-10 18:03 ` Max Krasnyanskiy @ 2008-04-14 18:39 ` Paul Jackson 2008-05-09 10:45 ` Peter Zijlstra 2008-04-14 18:42 ` boot cgroup questions Paul Jackson 1 sibling, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-04-14 18:39 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Max wrote: > I mean you guys were talking about how wonderful > and flexible cpusets are, but we cannot seem to use the flexibility because > the apps are designed for a flat layout No. Not flat. Not at all flat. We routinely and normally have an interesting hierarchy of cpusets below /dev/cpuset. However that hierarchy is determined by the nesting of subsets of the nodes (CPUs and/or Memory) on the system. These subsets of nodes in the /dev/cpuset hierarchy may well map nicely into the subsets of CPUs that can receive a particular set of IRQs, however that map is not bijective. Of particular interest here, it's not injective, meaning that multiple cpusets might and will commonly receive the same set of IRQs. You can force this map to be injective by elaborating the cpuset hierarchy to reflect both this new assignment of IRQs and the (CPU and/or Memory) node subset hierarchy that it currently reflects, but that will break code that was expecting the directory tree below /dev/cpuset to directly and only reflect the node hierarchy. In less mathematically obtuse wording, sure you can add more directory layers below /dev/cpuset, to handle IRQ assignments, but that will break code that was expecting the /dev/cpuset directory tree to only reflect the nesting of (CPU and/or Memory) nodes. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
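The "not injective" point above can be illustrated with a small sketch. The cpuset paths and IRQ numbers below are hypothetical, chosen only to show two cpusets in a node-based hierarchy receiving the same set of IRQs; nothing here is taken from a real system or from the thread.

```python
def is_injective(cpuset_to_irqs):
    """Return True if no two cpusets are mapped to the same set of IRQs."""
    seen = set()
    for irqs in cpuset_to_irqs.values():
        key = frozenset(irqs)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical layout: two batch-job cpusets both take the ordinary system
# IRQs, while a realtime cpuset takes one dedicated IRQ.
node_hierarchy = {
    "/dev/cpuset/batch/job1": {16, 17},
    "/dev/cpuset/batch/job2": {16, 17},
    "/dev/cpuset/rt": {42},
}

# job1 and job2 share an IRQ set, so the map is not injective.
shared = not is_injective(node_hierarchy)
```

Making the map injective would mean introducing one extra cpuset per distinct IRQ set, which is the additional directory layer the message above argues would break existing tools.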
* Re: boot cgroup questions 2008-04-14 18:39 ` Paul Jackson @ 2008-05-09 10:45 ` Peter Zijlstra 2008-05-09 11:17 ` IRQ affinities (was: boot cgroup questions) Paul Jackson 0 siblings, 1 reply; 42+ messages in thread From: Peter Zijlstra @ 2008-05-09 10:45 UTC (permalink / raw) To: Paul Jackson; +Cc: Max Krasnyanskiy, menage, mingo, linux-kernel On Mon, 2008-04-14 at 13:39 -0500, Paul Jackson wrote: > Max wrote: > > I mean you guys were talking about how wonderful > > and flexible cpusets are, but we cannot seem to use the flexibility because > > the apps are designed for a flat layout > > No. Not flat. Not at all flat. > > We routinely and normally have an interesting hierarchy of cpusets > below /dev/cpuset. However that hierarchy is determined by the > nesting of subsets of the nodes (CPUs and/or Memory) on the system. > > These subsets of nodes in the /dev/cpuset hierarchy may well map > nicely into the subsets of CPUs that can receive a particular set > of IRQs, however that map is not bijective. Of particular interest > here, it's not injective, meaning that multiple cpusets might and > will commonly receive the same set of IRQs. You can force this map > to be injective by elaborating the cpuset hierarchy to reflect both > this new assignment of IRQs and the (CPU and/or Memory) node subset > hierarchy that it currently reflects, but that will break code that > was expecting the directory tree below /dev/cpuset to directly and > only reflect the node hierarchy. > > In less mathematically obtuse wording, sure you can add more directory > layers below /dev/cpuset, to handle IRQ assignments, but that will > break code that was expecting the /dev/cpuset directory tree to only > reflect the nesting of (CPU and/or Memory) nodes. Sorry for being rather late to the game - other stuff keeps me from doing anything much here :-(. Anyway, the current applications don't support IRQ assignment anyway. 
That's a new feature; and it's quite common that new features require code changes. So I'm not seeing the problem - don't change code and stuff works as before - change code and you get new stuff. So I'm arguing in favour of the IRQs as tasks idea that might need extra hierarchy levels. ^ permalink raw reply [flat|nested] 42+ messages in thread
* IRQ affinities (was: boot cgroup questions) 2008-05-09 10:45 ` Peter Zijlstra @ 2008-05-09 11:17 ` Paul Jackson 2008-05-09 11:48 ` Peter Zijlstra 2008-05-21 1:14 ` Max Krasnyanskiy 0 siblings, 2 replies; 42+ messages in thread From: Paul Jackson @ 2008-05-09 11:17 UTC (permalink / raw) To: Peter Zijlstra; +Cc: maxk, menage, mingo, linux-kernel Peter wrote: > That's a new feature; and its quite common that new features require > code changes. It's common for new features to require code changes to take advantage of the new features. It's less desirable that taking advantage of such new features breaks existing, basically unrelated, code. My gut sense is that, in a misguided effort to find a "simple" answer to irq distribution, we (well, y'all) are trying to attach this feature to cpusets or cgroups. Let me ask a different question: What solutions would you (Max, Peter, Ingo, lurkers, ...) be suggesting for this 'IRQ affinity' problem if cpusets and cgroups didn't exist in any form whatsoever? The answer to that question might help me contribute to this discussion in another way ... it might help me understand better what we're really trying to do here. You guys were proposing mechanisms that don't fit my architecture sense of cpusets, but I was having problems figuring out what are the essential underlying requirements, independent of choice of mechanism. Perhaps by describing one or two possible alternative, cpuset-free, mechanisms that come more or less close to meeting our needs, I will glean a better understanding of these elusive requirements, and can better contribute to the discussion of design trade offs facing us. So could you describe some possible cpuset-free solutions? If they are flawed in some critical way, that's ok, just point out said flaw(s). Either way, this could help illuminate what's needed here. 
It might be, once I better understand the requirements, possible solutions and their tradeoffs, that I come to agree that cpusets or cgroups present the best mechanism, given the tradeoffs and what's needed. Or it might be we find a better way to meet our needs. Actually, if for no other reason than to bring any lurkers up to speed, if you (Max or Peter, likely) wanted to describe, from the beginning, what this discussion is about, that would be good too. I doubt anyone outside of three or four of us even recalls that long discussion of February and March, 2008. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities (was: boot cgroup questions) 2008-05-09 11:17 ` IRQ affinities (was: boot cgroup questions) Paul Jackson @ 2008-05-09 11:48 ` Peter Zijlstra 2008-05-09 12:03 ` Paul Jackson 2008-05-21 1:14 ` Max Krasnyanskiy 1 sibling, 1 reply; 42+ messages in thread From: Peter Zijlstra @ 2008-05-09 11:48 UTC (permalink / raw) To: Paul Jackson; +Cc: maxk, menage, mingo, linux-kernel On Fri, 2008-05-09 at 06:17 -0500, Paul Jackson wrote: > Peter wrote: > > That's a new feature; and its quite common that new features require > > code changes. > > It's common for new features to require code changes to take advantage > of the new features. > > It's less desirable that taking advantage of such new features breaks > existing, basically unrelated, code. > > My gut sense is that, in a misguided effort to find a "simple" answer > to irq distribution, we (well, y'all) are trying to attach this > feature to cpusets or cgroups. > > Let me ask a different question: > > What solutions would you (Max, Peter, Ingo, lurkers, ...) be > suggesting for this 'IRQ affinity' problem if cpusets and > cgroups didn't exist in any form whatsoever? > > The answer to that question might help me contribute to this discussion > in another way ... it might help me understand better what we're really > trying to do here. You guys were proposing mechanisms that don't fit > my architecture sense of cpusets, but I was having problems figuring out > what are the essential underlying requirements, independent of choice > of mechanism. > > Perhaps by describing one or two possible alternative, cpuset-free, > mechanisms that come more or less close to meeting our needs, I will > glean a better understanding of these elusive requirements, and can > better contribute to the discussion of design trade offs facing us. > > So could you describe some possible cpuset-free solutions? If they are > flawed in some critical way, that's ok, just point out said flaw(s). 
> Either way, this could help illuminate what's needed here. > > It might be, once I better understand the requirements, possible > solutions and their tradeoffs, that I come to agree that cpusets or > cgroups present the best mechanism, given the tradeoffs and what's > needed. Or it might be we find a better way to meet our needs. > > Actually, if for no other reason than to bring any lurkers up to speed, > if you (Max or Peter, likely) wanted to describe, from the beginning, > what this discussion is about, that would be good too. I doubt anyone > outside of three or four of us even recalls that long discussion of > February and March, 2008. I see two use-cases: - Isolation - NUMA node devices With isolation you want to move all of your 'normal' system tasks off to one side of your machine and use the other side for 'special - rt' tasks. For IRQs this means that you want to move all the 'normal' IRQs along with the 'normal' tasks, and move the special IRQs into the rt side. Of course you can do this by setting IRQ affinities one by one, but being able to group the IRQs seems a sensible thing to me. One thing here is that we'd like to also provide a default group for new IRQs, so that when a new device appears it's not allowed into the 'special' side of your machine. This is what Max focussed on, and provides a binary division of your machine: special and not special. Now I was thinking that if we generalize this whole thing it might be useful for other purposes such as IRQ placement near the nodes that host the device and/or the application using them. So what we'd end up with is named affinity groups that contain (unique) IRQs. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities (was: boot cgroup questions) 2008-05-09 11:48 ` Peter Zijlstra @ 2008-05-09 12:03 ` Paul Jackson 2008-05-09 12:14 ` Peter Zijlstra 0 siblings, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-05-09 12:03 UTC (permalink / raw) To: Peter Zijlstra; +Cc: maxk, menage, mingo, linux-kernel Peter wrote: > I see two use-cases: > > - Isolation > - NUMA node devices Ok ... so let me propose an entirely different solution. No doubt it has some terrible flaw, but I'll just have to await your replies to see what that is. How about we have: 1) Yet another text config file in /etc, this one containing lines having two fields: * a list of IRQs, and * a cpumask. This file would specify which CPUs should handle which IRQs. 2) A utility that can be run, after changing the above file, to poke the proper cpumask to each IRQ, as specified in the file. (Obligatory "simple" marketing claim: the above requires no kernel changes.) What am I missing? -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
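A minimal sketch of the config-file-plus-utility idea proposed above, under stated assumptions: the file format (an IRQ list such as `16,17,20-22`, whitespace, then a hex cpumask, with `#` comments) and all function names are hypothetical and do not come from the thread. The only existing interface used is `/proc/irq/N/smp_affinity`, which accepts a hex CPU mask.

```python
def parse_irq_list(text):
    """Expand a list such as "16,17,20-22" into a sorted list of IRQ numbers."""
    irqs = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            irqs.update(range(int(lo), int(hi) + 1))
        else:
            irqs.add(int(part))
    return sorted(irqs)


def parse_config(lines):
    """Map each IRQ number to the cpumask (as an int) given on its line."""
    table = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        irq_field, mask_field = line.split()
        mask = int(mask_field, 16)
        for irq in parse_irq_list(irq_field):
            table[irq] = mask  # a later line overrides an earlier one
    return table


def plan_writes(table):
    """Return the (path, value) pairs the utility would write; writes nothing.

    Assumes masks that fit in 32 bits; wider masks would need to be split
    into comma-separated 32-bit groups.
    """
    return [(f"/proc/irq/{irq}/smp_affinity", f"{mask:x}")
            for irq, mask in sorted(table.items())]


def apply_writes(writes):
    """Write each mask. Needs root, and fails for IRQs that do not exist yet."""
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)
```

Keeping the planning step separate from the writing step lets the file be checked without root. `apply_writes` fails for an IRQ whose `/proc/irq/N` directory is not there yet, so a user-space-only tool of this kind cannot cover IRQs that have not been allocated.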
* Re: IRQ affinities (was: boot cgroup questions) 2008-05-09 12:03 ` Paul Jackson @ 2008-05-09 12:14 ` Peter Zijlstra 2008-05-09 12:36 ` Paul Jackson 0 siblings, 1 reply; 42+ messages in thread From: Peter Zijlstra @ 2008-05-09 12:14 UTC (permalink / raw) To: Paul Jackson; +Cc: maxk, menage, mingo, linux-kernel On Fri, 2008-05-09 at 07:03 -0500, Paul Jackson wrote: > Peter wrote: > > I see two use-cases: > > > > - Isolation > > - NUMA node devices > > Ok ... so let me propose an entirely different solution. > > No doubt it has some terrible flaw, but I'll just have to > await your replies to see what that is. > > How about we have: > > 1) Yet another text config file in /etc, this one containing > lines having two fields: > * a list of IRQs, and > * a cpumask. > This file would specify which CPUs should handle which IRQs. > > 2) A utility that can be run, after changing the above file, > to poke the proper cpumask to each IRQ, as specified in > the file. > > (Obligatory "simple" marketing claim: the above requires no > kernel changes.) > > What am I missing? Two points: - we can't currently set irq affinities for non-existent (aka new) IRQs - it's a shame to duplicate the masks - most of this information would also be used in the cpuset structure used to place the tasks. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities (was: boot cgroup questions) 2008-05-09 12:14 ` Peter Zijlstra @ 2008-05-09 12:36 ` Paul Jackson 2008-05-09 17:43 ` Paul Jackson 2008-05-21 1:21 ` IRQ affinities Max Krasnyanskiy 0 siblings, 2 replies; 42+ messages in thread From: Paul Jackson @ 2008-05-09 12:36 UTC (permalink / raw) To: Peter Zijlstra; +Cc: maxk, menage, mingo, linux-kernel Peter, responding to pj: > > What am I missing? > > Two points: > > - we can't currently set irq affinities for non-existent (aka new) IRQs > - it's a shame to duplicate the masks - most of this information would > also be used in the cpuset structure used to place the tasks. Ok. Let me twist this a turn tighter then. The first of your two points, a default affinity mask for new irqs, would seem to require a kernel change. But that change could be a single cpumask, settable in /sys somewhere, specifying the default affinity. If that's all we needed, it would be easy. The second of your two points, "duplicating masks", seems more delicate. The space of named cpusets (the directory pathnames below the usual mount point, /dev/cpuset) is not really much more compact than the set of interesting cpumasks. But I suppose your point is that some of the -particular- cpumasks already named by the cpuset hierarchy are tantalizingly close to the set of interesting cpumasks needed for irq affinity ... close given some combination of union, intersection, set difference and complement operations, given my usual bias toward looking at such things as this using set theory mechanisms. That is, for example, one might want all the CPUs in cpusets foo, bar and baz, except the CPUs in cpuset blip, to handle IRQs so and so. Let me think on that ... it's my nap time now. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities (was: boot cgroup questions) 2008-05-09 12:36 ` Paul Jackson @ 2008-05-09 17:43 ` Paul Jackson 2008-05-21 1:21 ` IRQ affinities Max Krasnyanskiy 1 sibling, 0 replies; 42+ messages in thread From: Paul Jackson @ 2008-05-09 17:43 UTC (permalink / raw) To: Paul Jackson; +Cc: a.p.zijlstra, maxk, menage, mingo, linux-kernel pj, talking to himself: > That is, for example, one might want all the CPUs in cpusets > foo, bar and baz, except the CPUs in cpuset blip, to handle > IRQs so and so. Ahh! Perhaps that example has the keys to this kingdom. How about this. We add two files to each cpuset: irq_affinity_include # IRQs to direct to CPUs in this cpuset irq_affinity_exclude # IRQs -not- to direct to these CPUs where irq_affinity_exclude overrides irq_affinity_include. So, to determine to which CPUs a given interrupt (IRQ) can be directed: 1) Combine (union) the 'cpus' of all the cpusets for which that IRQ is in that cpuset's irq_affinity_include, then 2) Remove (set subtraction) the 'cpus' of any cpuset for which that IRQ is in that cpuset's irq_affinity_exclude. In the simplest case of just wanting to isolate some CPUs with their own special list of interrupts, one would: 1) include all interrupts in the top cpuset's irq_affinity_include, and 2) include the interrupts you don't want in the isolated cpuset's irq_affinity_exclude. Observe that there is no dependency on the cpuset hierarchy in the above. The contents of the files irq_affinity_include and irq_affinity_exclude would be inherited by child cpusets on creation from their parents. The one detail that puzzles me at the moment is what ownership and permissions these two irq_affinity_* files would have. I am concerned that the usual permissions, which allow a job to write its own cpuset files would allow a job to affect the overall system to a greater degree than is desired. 
Perhaps an additional inheritance rule would be useful and appropriate, such as a rule that a given cpuset's irq_affinity_include must be a subset of its parent's or a rule that a given cpuset's irq_affinity_exclude must be a -superset- of its parent's; I'm unsure here. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
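The include/exclude resolution rule described above (union of the 'cpus' of every cpuset that includes the IRQ, minus the 'cpus' of every cpuset that excludes it) can be sketched as follows. The in-memory representation, the cpuset names and the IRQ numbers are hypothetical; irq_affinity_include and irq_affinity_exclude are the proposed file names from the message, not an existing kernel interface.

```python
def allowed_cpus(irq, cpusets):
    """Resolve the CPUs an IRQ may be directed to under the proposed rule.

    `cpusets` maps a cpuset name to a dict with three sets:
    'cpus', 'irq_affinity_include' and 'irq_affinity_exclude'.
    The hierarchy itself plays no part in the result.
    """
    included = set()
    excluded = set()
    for cs in cpusets.values():
        if irq in cs["irq_affinity_include"]:
            included |= cs["cpus"]      # step 1: union of including cpusets
        if irq in cs["irq_affinity_exclude"]:
            excluded |= cs["cpus"]      # step 2: collect CPUs to remove
    return included - excluded          # exclude overrides include


# Hypothetical "simplest case": the top cpuset includes every IRQ, and the
# isolated cpuset excludes the IRQs it does not want.
cpusets = {
    "top": {
        "cpus": {0, 1, 2, 3},
        "irq_affinity_include": {16, 17, 42},
        "irq_affinity_exclude": set(),
    },
    "top/rt": {
        "cpus": {2, 3},
        "irq_affinity_include": set(),
        "irq_affinity_exclude": {16, 17},
    },
}
```

With this layout IRQ 16 resolves to CPUs 0 and 1, while IRQ 42 may still reach all four CPUs. An IRQ that appears in no include file resolves to the empty set, which is one of the conflict cases the rule leaves open.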
* Re: IRQ affinities 2008-05-09 12:36 ` Paul Jackson 2008-05-09 17:43 ` Paul Jackson @ 2008-05-21 1:21 ` Max Krasnyanskiy 1 sibling, 0 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-05-21 1:21 UTC (permalink / raw) To: Paul Jackson; +Cc: Peter Zijlstra, menage, mingo, linux-kernel Paul Jackson wrote: > Peter, responding to pj: >>> What am I missing? >> Two points: >> >> - we can't currently set irq affinities for non-existent (aka new) IRQs >> - its a shame to duplicate the masks - most of this information would >> also be used in the cpuset structure used to place the tasks. > > Ok. Let me twist this a turn tighter then. > > The first of your two points, a default affinitiy mask for new irqs, > would seem to require a kernel change. But that change could be a > single cpumask, settable in /sys somewhere, specifying the default > affinity. If that's all we needed, it would be easy. Looks like we arrived at the same conclusion. See my prev reply. I'm in the process of making a patch for exposing default affinity mask. > The second of your two points, "duplicating masks", seems more delicate. There is actually no duplication as far as I can see because IRQ layer already has the default_mask variable. It just needs to be exposed via /proc or /sys. > The space of named cpusets (the directory pathnames below the usual > mount point, /dev/cpuset) is not really much more compact than the > set of interesting cpumasks. But I suppose your point is that some > of the -particular- cpumasks already named by the cpuset hierarchy > are tantilizingly close to the set of interesting cpumasks needed for > irq affinity ... close given some combination of union, intersection, > set difference and compliment operations, given my usual bias toward > looking at such things as this using set theory mechanisms. That is, > for example, one might want all the CPUs in cpusets foo, bar and baz, > except the CPUs in cpuset blip, to handle IRQs so and so. 
> > Let me think on that ... it's my nap time now. This would be overkill imho. Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities 2008-05-09 11:17 ` IRQ affinities (was: boot cgroup questions) Paul Jackson 2008-05-09 11:48 ` Peter Zijlstra @ 2008-05-21 1:14 ` Max Krasnyanskiy 2008-05-21 4:45 ` Arjan van de Ven 2008-05-21 6:34 ` Paul Jackson 1 sibling, 2 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-05-21 1:14 UTC (permalink / raw) To: Paul Jackson; +Cc: Peter Zijlstra, menage, mingo, linux-kernel Paul Jackson wrote: > Peter wrote: >> That's a new feature; and it's quite common that new features require >> code changes. > > It's common for new features to require code changes to take advantage > of the new features. > > It's less desirable that taking advantage of such new features breaks > existing, basically unrelated, code. > > My gut sense is that, in a misguided effort to find a "simple" answer > to irq distribution, we (well, y'all) are trying to attach this > feature to cpusets or cgroups. > > Let me ask a different question: > > What solutions would you (Max, Peter, Ingo, lurkers, ...) be > suggesting for this 'IRQ affinity' problem if cpusets and > cgroups didn't exist in any form whatsoever? As Peter explained I'm focusing on the "CPU isolation" aspect. ie Shielding a CPU (or a set of CPUs) from various kernel activities (load balancing, soft and hard irq handling, workqueues, etc). For the IRQs specifically all I need is to be able to tell the kernel to not route IRQs to certain CPUs. That mostly works already via /proc/irq/N/smp_affinity, the problem is dynamically allocated irqs because the /proc/irq/N directory does not exist until those IRQs are allocated/enabled. Originally I introduced a global cpu_isolated_map. IRQ code was using that map to exclude CPU(s) from IRQ routing. What I realized now is that all I need is /proc/irq/default_smp_affinity. In other words I just need to export the default mask used by the IRQ layer. I think this makes sense regardless of what cpuset based solution we'll come up with. 
Max ^ permalink raw reply [flat|nested] 42+ messages in thread
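As a minimal sketch of how the default mask described above might be derived for CPU isolation: take all CPUs, clear the isolated ones, and format the result as hex. The helper name and the CPU numbers are hypothetical, and /proc/irq/default_smp_affinity is the file proposed in the message, so the path in the comment is an assumption about where the value would go.

```python
def default_affinity_mask(nr_cpus, isolated):
    """Return a bitmask with one bit per CPU, isolated CPUs cleared."""
    mask = 0
    for cpu in range(nr_cpus):
        if cpu not in isolated:
            mask |= 1 << cpu
    return mask


# Hypothetical 4-CPU machine with CPUs 2 and 3 reserved for realtime work.
mask = default_affinity_mask(4, {2, 3})
value = f"{mask:x}"  # hex string one would write to the proposed
                     # /proc/irq/default_smp_affinity file
```

Newly allocated IRQs would then start out on CPUs 0 and 1 only, without user space having to react after the IRQ already exists.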
* Re: IRQ affinities 2008-05-21 1:14 ` Max Krasnyanskiy @ 2008-05-21 4:45 ` Arjan van de Ven 2008-05-21 16:18 ` Max Krasnyanskiy 2008-05-21 6:34 ` Paul Jackson 1 sibling, 1 reply; 42+ messages in thread From: Arjan van de Ven @ 2008-05-21 4:45 UTC (permalink / raw) To: Max Krasnyanskiy Cc: Paul Jackson, Peter Zijlstra, menage, mingo, linux-kernel On Tue, 20 May 2008 18:14:58 -0700 Max Krasnyanskiy <maxk@qualcomm.co > > For the IRQs specifically all I need is to be able to tell the kernel > to not route IRQs to certain CPUs. That's mostly works already via > /proc/irq/N/smp_affinity, the problem is dynamically allocated irqs > because /proc/irq/N directory does not exist until those IRQs are > allocated/enabled. \\ why don't you tell irqbalance instead? it'll make sure the irq stays out of the wind... ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities 2008-05-21 4:45 ` Arjan van de Ven @ 2008-05-21 16:18 ` Max Krasnyanskiy 0 siblings, 0 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-05-21 16:18 UTC (permalink / raw) To: Arjan van de Ven Cc: Paul Jackson, Peter Zijlstra, menage, mingo, linux-kernel Arjan van de Ven wrote: > On Tue, 20 May 2008 18:14:58 -0700 > Max Krasnyanskiy <maxk@qualcomm.co > >> For the IRQs specifically all I need is to be able to tell the kernel >> to not route IRQs to certain CPUs. That's mostly works already via >> /proc/irq/N/smp_affinity, the problem is dynamically allocated irqs >> because /proc/irq/N directory does not exist until those IRQs are >> allocated/enabled. > \\ > > why don't you tell irqbalance instead? it'll make sure the irq stays > out of the wind... > That will be too late. By the time irqbalance sees that IRQ it may have already fired (possibly several times) on the "wrong" processor. Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities 2008-05-21 1:14 ` Max Krasnyanskiy 2008-05-21 4:45 ` Arjan van de Ven @ 2008-05-21 6:34 ` Paul Jackson 2008-05-21 17:58 ` Max Krasnyanskiy 1 sibling, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-05-21 6:34 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: a.p.zijlstra, menage, mingo, linux-kernel Max wrote: > What I realized now is that all I need is > /proc/irq/default_smp_affinity. I suspect that something like you're proposing to do here will answer your needs, to "tell the kernel to not route IRQs to certain CPUs." I suspect that other folks will have some additional needs, that perhaps my idea of May 9, 2008: How about this. We add two files to each cpuset: irq_affinity_include # IRQs to direct to CPUs in this cpuset irq_affinity_exclude # IRQs -not- to direct to these CPUs where irq_affinity_exclude overrides irq_affinity_include. could meet. It makes sense to me to deal with your "default_smp_affinity" patch first, and then come back around and see what remains to be done, and how to do it, perhaps with additional cpuset based mechanisms such as the above two irq_affinity_* IRQ masks. > I'm in the process of making a patch for exposing default affinity mask. Peter, et al: how does Max's planned "default_smp_affinity" patch sound to you, as the next step we take on this? -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: IRQ affinities 2008-05-21 6:34 ` Paul Jackson @ 2008-05-21 17:58 ` Max Krasnyanskiy 0 siblings, 0 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-05-21 17:58 UTC (permalink / raw) To: Paul Jackson; +Cc: a.p.zijlstra, menage, mingo, linux-kernel Hi Paul, Paul Jackson wrote: > Max wrote: >> What I realized now is that all I need is >> /proc/irq/default_smp_affinity. > > I suspect that something like you're proposing to do here will answer > your needs, to "tell the kernel to not route IRQs to certain CPUs." > > I suspect that other folks will have some additional needs, that perhaps > my idea of May 9, 2008: > > How about this. We add two files to each cpuset: > > irq_affinity_include # IRQs to direct to CPUs in this cpuset > irq_affinity_exclude # IRQs -not- to direct to these CPUs > > where irq_affinity_exclude overrides irq_affinity_include. > > could meet. I saw your earlier email with that proposal. Just had to digest it a bit :) (still catching up with things after vacation). > So, to determine to which CPUs a given interrupt (IRQ) can be directed: > 1) Combine (union) the 'cpus' of all the cpusets for which > that IRQ is in that cpuset's irq_affinity_include, then > 2) Remove (set subtraction) the 'cpus' of any cpuset for which > that IRQ is in that cpuset's irq_affinity_exclude. That would work. But wouldn't it be hard for the users to debug things ? I mean if you have a complex cpuset hierarchy it may be hard to figure out why a certain irq is not getting to cpuX and vice versa. btw How would we represent "all irqs", are you implying that those files contain masks ? We'll also need to handle conflicts like "irq excluded from all cpusets", etc. I still prefer "irq as a task" approach. It's very simple and straightforward mapping of an irq -> cpuset, no conflicts, etc. Easy to figure out for the user where an irq will end up. btw I did not quite get the idea behind the "exclude" part. Why is "include" not enough ? Can you give me an example. 
> It makes sense to me to deal with your "default_smp_affinity" patch > first, and then come back around and see what remains to be done, and > how to do it, perhaps with additional cpuset based mechanisms such as > the above two irq_affinity_* IRQ masks. > >> I'm in the process of making a patch for exposing default affinity mask. > > Peter, et al: how does Max's planned "default_smp_affinity" patch sound > to you, as the next step we take on this? I think it makes sense regardless of the cpuset based approach. Seems like a logical extension of the existing interface (ie per IRQ mask plus the default). Max ^ permalink raw reply [flat|nested] 42+ messages in thread
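The two-step include/exclude rule quoted in this message can be sketched in a few lines. This is a toy model only: the dictionary layout and the function name allowed_cpus are assumptions made for illustration, and the irq_affinity_include / irq_affinity_exclude files were a proposal in this thread, not an existing kernel interface.

```python
def allowed_cpus(irq, cpusets):
    """Resolve the CPUs an IRQ may be directed to under the proposed rule.

    `cpusets` is a hypothetical in-memory model: a list of dicts with keys
    'cpus', 'irq_affinity_include' and 'irq_affinity_exclude' (all sets).
    """
    included = set()
    excluded = set()
    for cs in cpusets:
        # Step 1: union the CPUs of every cpuset that includes this IRQ.
        if irq in cs["irq_affinity_include"]:
            included |= cs["cpus"]
        # Step 2: collect the CPUs of every cpuset that excludes this IRQ.
        if irq in cs["irq_affinity_exclude"]:
            excluded |= cs["cpus"]
    # Exclude overrides include, so subtract last.
    return included - excluded


cpusets = [
    {"cpus": {0, 1, 2, 3}, "irq_affinity_include": {16, 17}, "irq_affinity_exclude": set()},
    {"cpus": {2, 3}, "irq_affinity_include": set(), "irq_affinity_exclude": {17}},
]
print(sorted(allowed_cpus(16, cpusets)))
print(sorted(allowed_cpus(17, cpusets)))
```

The sketch also makes the conflict Max raises visible: an IRQ that is excluded from every cpuset that includes it, or included nowhere, resolves to an empty set, and some policy would be needed for that case.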
* Re: boot cgroup questions 2008-04-10 18:03 ` Max Krasnyanskiy 2008-04-14 18:39 ` Paul Jackson @ 2008-04-14 18:42 ` Paul Jackson 1 sibling, 0 replies; 42+ messages in thread From: Paul Jackson @ 2008-04-14 18:42 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Max K wrote: > I agree in general. In this particular case additional grouping introduces > even more hierarchy. It seems to me that > "irqN -> cpu1, cpu2, cpu3" > is a very simple, straightforward relationship. Whereas > "irqN -> groupX" > "groupX -> cpu1" > "groupX -> cpu2" > "groupX -> cpu3" > Is not that straightforward. Clearly, yes, the first is simpler than the second. The question is which is correct. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-13 0:57 ` Max Krasnyanskiy 2008-03-13 7:03 ` Paul Jackson @ 2008-03-13 7:12 ` Paul Jackson 2008-04-10 17:24 ` Max Krasnyanskiy 1 sibling, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-03-13 7:12 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel > How about we add support for sym links to the cgroup fs ? Still pollutes the primary cpuset name space ... you have all the directories X, X/A, and X/B as well as the symlinks A and B. Symlinks allow for one path that needs to be 'aliased' to another, but they are a one-way map; without an exhaustive search of the potential namespace, one can't invert them, or determine if they can't be inverted. Tools have to constantly make heuristic decisions whether to default to dereferencing the symlink, or not, and often have to provide alternatives for the non-default choice. They are a pain in the backside even if designed in and expected up front. If added as critical structure after the fact, something breaks, pretty much for sure. For one minor example, code I've probably buried someplace that does "find /dev/cpuset -type d" to find all cpusets would break. Or the one-line /sbin/cpuset_release_agent script: rmdir /dev/cpuset/$1 is broken -- fails to clean-up associated symlinks, and can't avoid race conditions if it tries to add code to do that. > Crazy idea. Agreed ;) But nice picture ;). -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
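The release-agent objection above can be reproduced outside the kernel with ordinary directories. The sketch below is an analogy only: it uses a temporary directory rather than a real /dev/cpuset mount, and the names X, A and the helper functions are hypothetical. It shows that a plain rmdir of the real directory leaves the alias symlink behind, now dangling.

```python
import os
import tempfile


def dangling_symlinks(root):
    """Return names directly under `root` that are symlinks to missing targets."""
    result = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        # islink() looks at the link itself; exists() follows it to the target.
        if os.path.islink(path) and not os.path.exists(path):
            result.append(name)
    return sorted(result)


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Real hierarchy X/A plus a top-level alias A -> X/A.
        os.makedirs(os.path.join(root, "X", "A"))
        os.symlink(os.path.join("X", "A"), os.path.join(root, "A"))
        # What a one-line release agent would do: remove only the directory.
        os.rmdir(os.path.join(root, "X", "A"))
        return dangling_symlinks(root)


print(demo())
```

Cleaning up the alias as well would take a second, separate filesystem operation, which is where the race Paul mentions would come in.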
* Re: boot cgroup questions 2008-03-13 7:12 ` Paul Jackson @ 2008-04-10 17:24 ` Max Krasnyanskiy 2008-04-10 17:37 ` Paul Jackson 0 siblings, 1 reply; 42+ messages in thread From: Max Krasnyanskiy @ 2008-04-10 17:24 UTC (permalink / raw) To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Sorry for disappearing on you guys. I'm working on releasing the user-space framework and engine that uses cpu isolation for hard-RT. Once that's done I'm going to resurrect these efforts. In the mean time let me reply to your last comments. Paul Jackson wrote: >> How about we add support for sym links to the cgroup fs ? > > Still pollutes the primary cpuset name space ... you have all > the directories X, X/A, and X/B as well as the symlinks A and B. > > Symlinks allow for one path that needs to be 'aliased' to another, > but they are a one-way map; without an exhaustive search of the > potential namespace, one can't invert them, or determine if they > can't be inverted. > > Tools have to constantly make heuristic decisions whether to > default to dereferencing the symlink, or not, and often have to > provide alternatives for the non-default choice. > > They are a pain in the backside even if designed in and expected > up front. > > If added as critical structure after the fact, something breaks, > pretty much for sure. > > For one minor example, code I've probably buried someplace that > does "find /dev/cpuset -type d" to find all cpusets would break. > > Or the one-line /sbin/cpuset_release_agent script: > rmdir /dev/cpuset/$1 > is broken -- fails to clean-up associated symlinks, and can't > avoid race conditions if it tries to add code to do that. > >> Crazy idea. > > Agreed ;) Got it. Symlinks are out :) Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-04-10 17:24 ` Max Krasnyanskiy @ 2008-04-10 17:37 ` Paul Jackson 0 siblings, 0 replies; 42+ messages in thread From: Paul Jackson @ 2008-04-10 17:37 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Max K wrote: > > Agreed ;) > > Got it. Symlinks are out :) Good ;). -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 22:29 ` Max Krasnyanskiy 2008-03-12 23:30 ` Paul Jackson @ 2008-03-12 23:32 ` Paul Jackson 2008-03-13 0:46 ` Max Krasnyanskiy 1 sibling, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-03-12 23:32 UTC (permalink / raw) To: Max Krasnyanskiy; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Max K wrote: > 1. use simple "irq == pseudo-task" concept and potentially break some existing > software. We do have a working solution. Breaking existing software is not what I call working. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 23:32 ` Paul Jackson @ 2008-03-13 0:46 ` Max Krasnyanskiy 0 siblings, 0 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-03-13 0:46 UTC (permalink / raw) To: Paul Jackson; +Cc: menage, mingo, a.p.zijlstra, linux-kernel Paul Jackson wrote: > Max K wrote: >> 1. use simple "irq == pseudo-task" concept and potentially break some existing >> software. We do have a working solution. > > Breaking existing software is not what I call working. > Ok ok I get it :) You know what I meant though. In the other scheme it's not even clear how it'd work in general. Max ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 4:59 ` Paul Jackson 2008-03-12 18:24 ` Max Krasnyanskiy @ 2008-03-12 19:16 ` Paul Menage 2008-03-12 19:24 ` Paul Jackson 1 sibling, 1 reply; 42+ messages in thread From: Paul Menage @ 2008-03-12 19:16 UTC (permalink / raw) To: Paul Jackson; +Cc: maxk, mingo, a.p.zijlstra, linux-kernel On Tue, Mar 11, 2008 at 9:59 PM, Paul Jackson <pj@sgi.com> wrote: > > Paul M -- could we take a different tack here -- extend cgroups to map > -either- tasks or irqs to the managed resources? Not cgroups, no. If you really wanted to extend cpusets specifically to allow irqs to be assigned to a cpuset to control which cpus they could execute on, then that might be a possibility. But I don't see how this would be useful for any other cgroup subsystem, so it doesn't belong in the generic framework. My feeling is that just using a simple bitmask assignment, unrelated to cpusets or cgroups, as Max suggested in his later email is the way to go. Paul ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 19:16 ` Paul Menage @ 2008-03-12 19:24 ` Paul Jackson 2008-03-12 19:30 ` Max Krasnyanskiy 0 siblings, 1 reply; 42+ messages in thread From: Paul Jackson @ 2008-03-12 19:24 UTC (permalink / raw) To: Paul Menage; +Cc: maxk, mingo, a.p.zijlstra, linux-kernel Paul M wrote: > Not cgroups, no. If you really wanted to extend cpusets specifically > to allow irqs to be assigned to a cpuset to control which cpus they > could execute on, then that might be a possibility. But I don't see > how this would be useful for any other cgroup subsystem, so it doesn't > belong in the generic framework. Ok - a sensible decision. > My feeling is that just using a simple bitmask assignment, unrelated > to cpusets or cgroups, as Max suggested in his later email is the way > to go. I'll have to have another go at reading his replies. I seem to have more difficulty making sense of his posts ... not sure why. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.940.382.4214 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: boot cgroup questions 2008-03-12 19:24 ` Paul Jackson @ 2008-03-12 19:30 ` Max Krasnyanskiy 0 siblings, 0 replies; 42+ messages in thread From: Max Krasnyanskiy @ 2008-03-12 19:30 UTC (permalink / raw) To: Paul Jackson; +Cc: Paul Menage, mingo, a.p.zijlstra, linux-kernel Paul Jackson wrote: > Paul M wrote: >> Not cgroups, no. If you really wanted to extend cpusets specifically >> to allow irqs to be assigned to a cpuset to control which cpus they >> could execute on, then that might be a possibility. But I don't see >> how this would be useful for any other cgroup subsystem, so it doesn't >> belong in the generic framework. > > Ok - a sensible decision. > >> My feeling is that just using a simple bitmask assignment, unrelated >> to cpusets or cgroups, as Max suggested in his later email is the way >> to go. > > I'll have to have another go at reading his replies. I seem to have > more difficulty making sense of his posts ... not sure why. I'm sure it's because of gazillion typos in them :). Max ^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2008-05-21 17:58 UTC | newest]
Thread overview: 42+ messages
2008-03-12 1:23 boot cgroup questions Max Krasnyansky
2008-03-12 1:27 ` Paul Menage
2008-03-12 2:34 ` Max Krasnyansky
2008-03-12 2:36 ` Paul Menage
2008-03-12 2:53 ` Max Krasnyansky
2008-03-12 3:09 ` Paul Menage
2008-03-12 3:39 ` Max Krasnyansky
2008-03-12 4:59 ` Paul Jackson
2008-03-12 18:24 ` Max Krasnyanskiy
2008-03-12 18:57 ` Paul Jackson
2008-03-12 19:11 ` Max Krasnyanskiy
2008-03-12 19:32 ` Paul Jackson
2008-03-12 20:08 ` Max Krasnyanskiy
2008-03-12 20:37 ` Paul Jackson
2008-03-12 22:29 ` Max Krasnyanskiy
2008-03-12 23:30 ` Paul Jackson
2008-03-13 0:57 ` Max Krasnyanskiy
2008-03-13 7:03 ` Paul Jackson
2008-04-10 18:03 ` Max Krasnyanskiy
2008-04-14 18:39 ` Paul Jackson
2008-05-09 10:45 ` Peter Zijlstra
2008-05-09 11:17 ` IRQ affinities (was: boot cgroup questions) Paul Jackson
2008-05-09 11:48 ` Peter Zijlstra
2008-05-09 12:03 ` Paul Jackson
2008-05-09 12:14 ` Peter Zijlstra
2008-05-09 12:36 ` Paul Jackson
2008-05-09 17:43 ` Paul Jackson
2008-05-21 1:21 ` IRQ affinities Max Krasnyanskiy
2008-05-21 1:14 ` Max Krasnyanskiy
2008-05-21 4:45 ` Arjan van de Ven
2008-05-21 16:18 ` Max Krasnyanskiy
2008-05-21 6:34 ` Paul Jackson
2008-05-21 17:58 ` Max Krasnyanskiy
2008-04-14 18:42 ` boot cgroup questions Paul Jackson
2008-03-13 7:12 ` Paul Jackson
2008-04-10 17:24 ` Max Krasnyanskiy
2008-04-10 17:37 ` Paul Jackson
2008-03-12 23:32 ` Paul Jackson
2008-03-13 0:46 ` Max Krasnyanskiy
2008-03-12 19:16 ` Paul Menage
2008-03-12 19:24 ` Paul Jackson
2008-03-12 19:30 ` Max Krasnyanskiy