* cpuset attach_task to touch per-cpu kernel threads? @ 2007-06-21 1:49 Srivatsa Vaddagiri 2007-06-21 2:35 ` Paul Jackson 2007-06-21 12:16 ` Ingo Molnar 0 siblings, 2 replies; 8+ messages in thread From: Srivatsa Vaddagiri @ 2007-06-21 1:49 UTC (permalink / raw) To: pj; +Cc: clameter, Ingo Molnar, linux-kernel, Dinakar Guniguntala, akpm Paul, You had once revealed a cute one-line command to move all tasks from one cpuset to another [1], which was: # move all tasks from top cpuset to 'foo' cpuset sed -nu p < /dev/cpuset/tasks > /dev/cpuset/foo/tasks I somewhat regret now having fallen for it and using it in my scripts. To my agony, I found that it moves per-cpu kernel threads too, forcibly breaking their affinity. In my case, rq->migration thread (kernel/sched.c) was moved off cpu3 and started running on cpu2, which caused nasty problems for me. I am sure this can lead to problems for other per-cpu kernel threads, if their assumption of per-cpu'ness is broken this way. One could argue that 'root' user did this and nothing wrong in assuming he knows what he is doing. But I am wondering if attach_task() should leave kernel threads alone and act only upon user-space threads. Or maybe allow movement if it doesn't result in changing kernel-threads's cpu affinity. Do you have anything to say regarding this? Fyi, this was what I was doing (as root): #!/bin/bash mount -t container -o cpuset none /dev/cpuset cd /dev/cpuset mkdir sys # create a cpuset to move all tasks into mkdir test # test cpuset in which my tests will run # Assign cpus to both cpusets cd sys; echo 0-2 > cpus; echo 0 > mems; echo 1 > cpu_exclusive; cd .. cd test; echo 3 > cpus; echo 0 > mems; echo 1 > cpu_exclusive; cd .. # Move all tasks to 'sys' cpuset so that cpu3 is dedicated to # only my chosen tasks sed -nu p < /dev/cpuset/tasks > /dev/cpuset/tasks echo $$ > test/tasks /path_to/test_prg References: 1. http://marc.info/?l=linux-kernel&m=115627306628524 -- Regards, vatsa ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-21 1:49 cpuset attach_task to touch per-cpu kernel threads? Srivatsa Vaddagiri @ 2007-06-21 2:35 ` Paul Jackson 2007-06-21 12:16 ` Ingo Molnar 1 sibling, 0 replies; 8+ messages in thread From: Paul Jackson @ 2007-06-21 2:35 UTC (permalink / raw) To: vatsa; +Cc: clameter, mingo, linux-kernel, dino, akpm Srivatsa wrote: # move all tasks from top cpuset to 'foo' cpuset sed -nu p < /dev/cpuset/tasks > /dev/cpuset/foo/tasks Aha - that won't work very well, as you noticed. In looking through my past email archives, I can see where I have recommended this trick to move things -into- the top cpuset, but not to move them -out-of- the top cpuset. I would not (except by mistake) recommend using the above trick to move tasks out of the top cpuset, for the very reason you note. I do have code to move tasks out of the top cpuset in bulk, but it is written in C, as part of some SGI proprietary product (though it relies mostly on some LGPL licensed bitmask and cpuset libraries that I have written for SGI.) The basic idea is to avoid moving the threads whose 'Cpus_allowed' value in their /proc/<pid>/status file is a strict subset of (less than, but -not- equal to) the set of online cpus. The kernel threads that you don't want to move are those that are pinned to specific cpus or nodes; they are where they must be for their purposes. Kernel threads that are not pinned can be moved just as readily as user threads; they just need to be someplace. I don't know any easy way to script this. We just don't have the tools in shell script to conveniently work with sets or bit masks. ... not yet anyway ... see below. Checking whether a tasks Cpus_allowed is a strict subset of the set of online cpus may not always be the check needed, depending on what one is doing. But it matched my needs, and from the looks of what you're doing in that script, making the sys and test cpusets, it might well match your needs as well, as your needs look rather similar to what mine were. > But I am wondering if attach_task() should leave kernel threads alone and > act only upon user-space threads. Or maybe allow movement if it doesn't > result in changing kernel-threads's cpu affinity. I tend to favor minimizing kernel support, where user level support is sufficient. This was especially important in the early years of Linux 2.6 cpuset kernel work, when it was vital for its acceptance to minimize the kernel footprint of cpusets as much as I was able. I would tend to favor encouraging further development of the user level support for such work, over special cases in the kernel calls just because we had not yet provided the user code to conveniently make a certain test, but could well enough do so if need be. Hopefully, in a few months, when I got off this other, non-cpuset related, task that I'm on, I will get some time to publish the user level LGPL licensed library code that makes working with bitmasks and cpusets convenient in user level C code. The code is in excellent shape, if I do say so myself. The next step, which I have fond dreams of getting to, but which is so low on my managers priority list that there is nearly zero chance of it happening, would be to provide a scriptable interface to the bitmask code. I'd probably do that as a Python module, integrating it with Python sets. This would take little more than some routines to import and export Python sets to the couple of commonly used ascii representations of bitmasks, which are those formed by the kernels lib/bitmap.c bitmap_scnprintf() and bitmap_scnlistprintf() routines. Such work is sufficiently generic that it might even be acceptable as a patch to the mainline Python release. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.925.600.0401 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-21 1:49 cpuset attach_task to touch per-cpu kernel threads? Srivatsa Vaddagiri 2007-06-21 2:35 ` Paul Jackson @ 2007-06-21 12:16 ` Ingo Molnar 2007-06-21 17:07 ` Paul Jackson 1 sibling, 1 reply; 8+ messages in thread From: Ingo Molnar @ 2007-06-21 12:16 UTC (permalink / raw) To: Srivatsa Vaddagiri; +Cc: pj, clameter, linux-kernel, Dinakar Guniguntala, akpm * Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> wrote: > But I am wondering if attach_task() should leave kernel threads alone > and act only upon user-space threads. Or maybe allow movement if it > doesn't result in changing kernel-threads's cpu affinity. yeah, i'd agree with the latter. We could also special-case the migration thread to never be migrated itself. (although that's not the end of the matter either - two ksoftirqds on a cpu are not healty) Ingo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-21 12:16 ` Ingo Molnar @ 2007-06-21 17:07 ` Paul Jackson 2007-06-21 17:32 ` Srivatsa Vaddagiri 0 siblings, 1 reply; 8+ messages in thread From: Paul Jackson @ 2007-06-21 17:07 UTC (permalink / raw) To: Ingo Molnar; +Cc: vatsa, clameter, linux-kernel, dino, akpm Ingo, responding to Srivatsa: > > Or maybe allow movement if it > > doesn't result in changing kernel-threads's cpu affinity. > > yeah, i'd agree .. Good point. I'd agree too. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.925.600.0401 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-21 17:07 ` Paul Jackson @ 2007-06-21 17:32 ` Srivatsa Vaddagiri 2007-06-21 17:51 ` Paul Jackson 0 siblings, 1 reply; 8+ messages in thread From: Srivatsa Vaddagiri @ 2007-06-21 17:32 UTC (permalink / raw) To: Paul Jackson; +Cc: Ingo Molnar, clameter, linux-kernel, dino, akpm On Thu, Jun 21, 2007 at 10:07:12AM -0700, Paul Jackson wrote: > Ingo, responding to Srivatsa: > > > Or maybe allow movement if it > > > doesn't result in changing kernel-threads's cpu affinity. > > > > yeah, i'd agree .. > > Good point. I'd agree too. Yeah .."allow movement if it doesn't result in changing kernel-threads's cpu affinity" sounds good, except it is hard to implement in cpuset's context I think. For ex: we now have to take additional steps when changing 'cpus_allowed' of a cpuset such that it doesn't violate any cpu affinity of kernel threads bound to the cpuset. That itself makes the implementation complex I think. How about a simpler patch which bans movement of kernel threads from its home cpuset (i.e top cpuset)? Index: current/kernel/cpuset.c =================================================================== --- current.orig/kernel/cpuset.c 2007-06-21 19:42:18.000000000 +0530 +++ current/kernel/cpuset.c 2007-06-21 22:24:38.000000000 +0530 @@ -881,6 +881,10 @@ if (cpus_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)) return -ENOSPC; + /* Don't allow kernel threads to be moved */ + if (!tsk->mm) + return -EINVAL; + return security_task_setscheduler(tsk, 0, NULL); } This probably catches exiting user-space tasks also (whose ->mm pointer is null). Hmm ..there should be a better check for kernel threads. -- Regards, vatsa ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-21 17:32 ` Srivatsa Vaddagiri @ 2007-06-21 17:51 ` Paul Jackson 2007-06-22 17:16 ` Srivatsa Vaddagiri 0 siblings, 1 reply; 8+ messages in thread From: Paul Jackson @ 2007-06-21 17:51 UTC (permalink / raw) To: vatsa; +Cc: mingo, clameter, linux-kernel, dino, akpm > Yeah .."allow movement if it doesn't result in changing kernel-threads's cpu > affinity" sounds good, except it is hard to implement in cpuset's > context I think. For ex: we now have to take additional steps when > changing 'cpus_allowed' of a cpuset such that it doesn't violate any cpu > affinity of kernel threads bound to the cpuset. That itself makes the > implementation complex I think. Yes, that would be more complex than we need. You're right. The only problem comes with kernel tasks that are pinned to less than the entire system, and that are in the top cpuset. They should remain in the top cpuset. Since the top cpuset is never allowed to have less than all the cpus in the system, such kernel threads can then be assured of being allowed to use the cpus they need to use. Could you try coding something like the following in kernel/cpuset.c, Srivatsa? /* Call with task_lock held on tsk */ is_pinned_kernel_thread(tsk) { /* "pinned" means constrained to a strict subset of online cpus */ int is_pinned = !cpus_subset(cpus_online_map, tsk->cpus_allowed); int is_kthread = !tsk->mm || (tsk->flags & PF_BORROWED_MM); return is_pinned && is_kthread; } Then call it from kernel/cpuset.c:attach_task() with: /* Don't moved pinned kernel threads out of top cpuset */ if (is_pinned_kernel_thread(tsk) && oldcs == &top_cpuset && cs != oldcs) { task_unlock(tsk); mutex_unlock(&callback_mutex); put_task_struct(tsk); return -EINVAL; } -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.925.600.0401 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-21 17:51 ` Paul Jackson @ 2007-06-22 17:16 ` Srivatsa Vaddagiri 2007-06-22 17:41 ` Paul Jackson 0 siblings, 1 reply; 8+ messages in thread From: Srivatsa Vaddagiri @ 2007-06-22 17:16 UTC (permalink / raw) To: Paul Jackson; +Cc: mingo, clameter, linux-kernel, dino, akpm On Thu, Jun 21, 2007 at 10:51:52AM -0700, Paul Jackson wrote: > The only problem comes with kernel tasks that are pinned to less than > the entire system, and that are in the top cpuset. That again is not fool-proof. What if kernel-tasks change their cpu affinity after we have done the is_pinned_kernel_thread() test? Ideally they should not, but one never knows! IMHO we simply should not allow kernel threads to move out of top-cpuset (unless you know of a good reason where we may want to move them). int cpuset_can_attach() { int is_kthread = !tsk->mm || (tsk->flags & PF_BORROWED_MM); ... /* Don't moved pinned kernel threads out of top cpuset */ if (is_kthread && oldcs == &top_cpuset && cs != oldcs) { task_unlock(tsk); mutex_unlock(&callback_mutex); put_task_struct(tsk); return -EINVAL; } } What do you think? -- Regards, vatsa ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: cpuset attach_task to touch per-cpu kernel threads? 2007-06-22 17:16 ` Srivatsa Vaddagiri @ 2007-06-22 17:41 ` Paul Jackson 0 siblings, 0 replies; 8+ messages in thread From: Paul Jackson @ 2007-06-22 17:41 UTC (permalink / raw) To: vatsa; +Cc: mingo, clameter, linux-kernel, dino, akpm Srivatsa wrote: > That again is not fool-proof. What if kernel-tasks change their cpu affinity > after we have done the is_pinned_kernel_thread() test? Ideally they > should not, but one never knows! > > IMHO we simply should not allow kernel threads to move out of top-cpuset Well ... in some theoretical world, perhaps. In reality, there a big pile of kernel threads that we want to move out of the root cpuset. And in reality, kernel threads don't change from being unrestricted (all cpus allowed) to being pinned (to some specific subset of cpus). Kernel threads that need to be pinned know it up front; it's an essential part of whatever they do. I've been working with installed customer configurations for about two and a half years now that move unpinned kernel threads out of the top cpuset, as part of keeping portions of the system freed up for dedicated jobs. I cannot agree to removing this capability. Nack. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.925.600.0401 ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-06-22 17:41 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-06-21 1:49 cpuset attach_task to touch per-cpu kernel threads? Srivatsa Vaddagiri 2007-06-21 2:35 ` Paul Jackson 2007-06-21 12:16 ` Ingo Molnar 2007-06-21 17:07 ` Paul Jackson 2007-06-21 17:32 ` Srivatsa Vaddagiri 2007-06-21 17:51 ` Paul Jackson 2007-06-22 17:16 ` Srivatsa Vaddagiri 2007-06-22 17:41 ` Paul Jackson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox