* Re: Xterm Hangs - Possible scheduler defect?
From: Chad N. Tindel @ 2005-02-24 17:53 UTC
To: Mike Galbraith; +Cc: akpm, linux-kernel

> > Hmmm... Are you suggesting it is OK for a kernel to get nearly completely
> > hosed and to not fully utilize all the processors in the system because
> > of one SCHED_FIFO thread?
>
> Sure. You specifically directed the scheduler to run your thread at a
> higher priority than anything else. The way I see it, you used root's
> prerogative to shoot himself in the foot. You could also have used root's
> prerogative to don steel toed shoes (set important kernel threads to a higher
> priority) before pulling the trigger.

No, I specifically directed the scheduler to run my thread at a higher
priority than any other userspace application. The fact that I wrote it
in userspace and not in kernel space implies that I am OK with the kernel
stopping me sometimes when _it_ has work to do. If I wanted something
higher priority than the kernel I would have written something in kernel
space instead.

> > SCHED_FIFO thread are supposed to preempt
> > all other userspace threads... not the kernel itself.
>
> Not so. The scheduler makes no distinction between user and kernel threads
> of execution.

That is SOOOO broken it isn't even funny.

> If you think that's broken, you'll _love_ Ingo's IRQ threads. While testing
> one of his recent (slightly buggy) unprivileged-user-does-RT patches in an
> IRQ threadified kernel, I ran a user SCHED_FIFO task at higher than the IRQ0
> thread... if my box had been an embedded washing machine controller instead
> of a desktop box, it'd have been a classic case of "No tickie no washie" :))

Yeah, that's broken too.

Perhaps I don't understand this philosophy you have where the kernel
isn't more important than everything else. It seems to me like there needs
to be a rigid hierarchy for scheduling, lest you get into deadlock problems:

1. Kernel preempts all. There may be some hierarchy of kernel priorities
   too, but it isn't important here.
2. SCHED_FIFO processes preempt all userspace applications.
3. SCHED_RR.
4. SCHED_OTHER.

Under no circumstances should any single CPU-bound userspace thread
completely hose a 64-way SMP box.

Can somebody educate me on why it is correct to do it any other way?

Chad

* Re: Xterm Hangs - Possible scheduler defect?
From: Chris Friesen @ 2005-02-24 18:19 UTC
To: Chad N. Tindel; +Cc: Mike Galbraith, akpm, linux-kernel

Chad N. Tindel wrote:
> 1. Kernel preempts all. There may be some hierarchy of kernel priorities
> too, but it isn't important here.
> 2. SCHED_FIFO processes preempt all userspace applications.
> 3. SCHED_RR.
> 4. SCHED_OTHER.
>
> Under no circumstances should any single CPU-bound userspace thread completely
> hose a 64-way SMP box.
>
> Can somebody educate me on why it is correct to do it any other way?

Low-latency userspace apps. The audio guys, for instance, are trying to
get latencies down to the 100us range.

If random kernel threads can preempt userspace at any time, they wreak
havoc with latency as seen by userspace.

Chris

* Re: Xterm Hangs - Possible scheduler defect?
From: Chad N. Tindel @ 2005-02-24 18:38 UTC
To: Chris Friesen; +Cc: Mike Galbraith, akpm, linux-kernel

> Low-latency userspace apps. The audio guys, for instance, are trying to
> get latencies down to the 100us range.
>
> If random kernel threads can preempt userspace at any time, they wreak
> havoc with latency as seen by userspace.

Come now. There is no such thing as a random kernel thread. Any General
Purpose kernel needs the ability to do work that keeps the entire system from
grinding to a halt.

Chad

* Re: Xterm Hangs - Possible scheduler defect?
From: Paulo Marques @ 2005-02-24 19:04 UTC
To: Chad N. Tindel; +Cc: Chris Friesen, Mike Galbraith, akpm, linux-kernel

Chad N. Tindel wrote:
>>Low-latency userspace apps. The audio guys, for instance, are trying to
>>get latencies down to the 100us range.
>>
>>If random kernel threads can preempt userspace at any time, they wreak
>>havoc with latency as seen by userspace.
>
>
> Come now. There is no such thing as a random kernel thread. Any General
> Purpose kernel needs the ability to do work that keeps the entire system from
> grinding to a halt.

FYI, most kernel threads do background work that doesn't have hard
real-time constraints. Why should my audio recording session get interrupted
(read: "sent to the trashcan") just because the swap daemon decided that it
was a good time to write some pages out? Couldn't it have waited just a few
more milliseconds?

You don't seem to realize that you have just arrived at this mailing list
and missed years of discussions on kernel architecture.

If you keep a learning attitude, there is a chance for this discussion
to go on. However, if you keep the "Come now, don't bullshit me, this is
a broken architecture and you're just trying to cover up" attitude,
you're just going to get discarded as a troll.

I personally like the linux way: "root has the ability to shoot himself
in the foot if he wants to". This is my computer, damn it, I am the one
who tells it what to do.

This is much, much better than the "users are stupid, we must protect
them from themselves" kind of way that other OSes use.

Just my 0.02 euros,

--
Paulo Marques - www.grupopie.com

All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke (1729 - 1797)

* Re: Xterm Hangs - Possible scheduler defect?
From: Chad N. Tindel @ 2005-02-24 19:22 UTC
To: Paulo Marques
Cc: Chad N. Tindel, Chris Friesen, Mike Galbraith, akpm, linux-kernel

> If you keep a learning attitude, there is a chance for this discussion
> to go on. However, if you keep the "Come now, don't bullshit me, this is
> a broken architecture and you're just trying to cover up" attitude,
> you're just going to get discarded as a troll.

I'm not trying to troll here; I suppose I'm just coming from a different
background. I'll try to adjust my tone.

> I personally like the linux way: "root has the ability to shoot himself
> in the foot if he wants to". This is my computer, damn it, I am the one
> who tells it what to do.

I'm all for allowing people to shoot themselves in the foot. That doesn't
mean that it is OK for a single userspace thread to mess up a 64-way box.

> This is much, much better than the "users are stupid, we must protect
> them from themselves" kind of way that other OSes use.

Isn't this what the kernel attempts to do in many other places? What else
could possibly be the point of sending SIGSEGV and causing applications
to dump core when they make a mistake referencing memory? Isn't it the
kernel's job to protect one application from another?

I see what you're saying about the swap daemon. How about if I make my
statement less black and white: some kernel threads should always have
priority over userspace.

I believe the mindset required for a home system that is doing audio
recordings is different than the one for enterprise-level systems. How do we
unify the two?

Chad

* Re: Xterm Hangs - Possible scheduler defect?
From: Chris Friesen @ 2005-02-24 19:46 UTC
To: Chad N. Tindel; +Cc: Paulo Marques, Mike Galbraith, akpm, linux-kernel

Chad N. Tindel wrote:

>>I personally like the linux way: "root has the ability to shoot himself
>>in the foot if he wants to". This is my computer, damn it, I am the one
>>who tells it what to do.

> I'm all for allowing people to shoot themselves in the foot. That doesn't
> mean that it is OK for a single userspace thread to mess up a 64-way box.

If root has explicitly stated that the thread in question is the highest
priority thing on the entire machine, why would it not be okay? The
fact that root made a mistake is the issue here, not that the system
didn't protect itself.

>>This is much, much better than the "users are stupid, we must protect
>>them from themselves" kind of way that other OSes use.

> Isn't this what the kernel attempts to do in many other places? What else
> could possibly be the point of sending SIGSEGV and causing applications
> to dump core when they make a mistake referencing memory? Isn't it the
> kernel's job to protect one application from another?

Yes. But at the same time, if root wants to do something, the kernel
should let them. There are many, many ways that root could trash the
system. "cat /dev/urandom > /dev/kcore" would do quite nicely.

> I see what you're saying about the swap daemon. How about if I make my
> statement less black and white. Some kernel threads should always have
> priority over userspace.

Not necessarily. The latest latency-reduction patches allow root to
specify exactly what the priorities should be.

> I believe the mindset required for a home system that is doing audio recordings
> is different than the one for enterprise-level systems. How do we unify
> the two?

There are professionals who use linux for audio as well, it's not just
home systems. That said, you unify them with reasonable defaults, and
the ability for root to override them.

Chris

* Re: Xterm Hangs - Possible scheduler defect?
From: Chad N. Tindel @ 2005-02-24 20:08 UTC
To: Chris Friesen; +Cc: Paulo Marques, Mike Galbraith, akpm, linux-kernel

> >I'm all for allowing people to shoot themselves in the foot. That doesn't
> >mean that it is OK for a single userspace thread to mess up a 64-way box.
>
> If root has explicitly stated that the thread in question is the highest
> priority thing on the entire machine, why would it not be okay? The
> fact that root made a mistake is the issue here, not that the system
> didn't protect itself.

Yeah, I realized when I left for lunch that this statement wasn't as clear
as I would like it to be.

I think what we have is the need for two levels of applications:

1. That which wishes to be the highest priority userspace application, and
   wishes to preempt all other userspace applications. Such an application is
   OK being preempted by the kernel when the kernel needs to do work. IMHO,
   this should be the default behavior for any SCHED_FIFO application. If one
   of these has a bug and goes CPU-bound, the worst it can do is prevent other
   apps from ever using the CPU it is on.

2. Applications which actually want to be the highest priority thing on
   the system, including being higher than the kernel. These applications are
   OK with the fact that they may cause system hangs and deadlocks, and are
   careful not to shoot themselves in the foot.

> There are professionals who use linux for audio as well, it's not just
> home systems. That said, you unify them with reasonable defaults, and
> the ability for root to override them.

OK. Would you say it would be a reasonable default to have SCHED_FIFO
userspace threads running at a lower priority than essential kernel threads
(say, the load balancer and the events thread), and give root the ability to
explicitly have userspace threads preempt the kernel?

Chad

* Re: Xterm Hangs - Possible scheduler defect?
From: Chris Friesen @ 2005-02-24 20:29 UTC
To: Chad N. Tindel; +Cc: Paulo Marques, Mike Galbraith, akpm, linux-kernel

Chad N. Tindel wrote:

> OK. Would you say it would be a reasonable default to have SCHED_FIFO userspace
> threads running at a lower priority than essential kernel threads (say, the
> load balancer and the events thread), and give root the ability to explicitly
> have userspace threads preempt the kernel?

The current scheduler has a 1-100 priority range for soft realtime
tasks. To insert a task into a realtime class, you need to have root
privileges. As long as you make sure that kernel threads get set to
higher priorities than your user threads, then you get the above behaviour.

Ultimately, however, the administrator is responsible for ensuring that
everything is running with sane priority levels.

Chris

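A minimal sketch of the advice in the message above, assuming a Linux system and Python's `os` scheduling wrappers. The function names and the `headroom` parameter are hypothetical, chosen only for illustration. The idea is to pick a realtime priority for the user task that deliberately leaves some levels free above it, so that any kernel threads the administrator has raised can still preempt it. The range is queried at run time instead of being hard-coded, since the limits reported to userspace may differ from the figures quoted in the thread.

```python
import os


def pick_user_rt_priority(headroom=10, policy=os.SCHED_FIFO):
    """Return a realtime priority that leaves `headroom` levels free above it.

    The free levels are meant for kernel threads that the administrator
    has raised, so that they can still preempt this task.
    """
    lowest = os.sched_get_priority_min(policy)
    highest = os.sched_get_priority_max(policy)
    # Never go below the lowest valid priority for this policy.
    return max(lowest, highest - headroom)


def try_make_realtime(priority, policy=os.SCHED_FIFO):
    """Try to move the calling process into `policy`; report success."""
    try:
        os.sched_setscheduler(0, policy, os.sched_param(priority))
        return True
    except PermissionError:
        # Without sufficient privileges the kernel refuses the request.
        return False
```

Calling `try_make_realtime(pick_user_rt_priority())` as an unprivileged user is expected to return `False`; as root it switches the calling process to SCHED_FIFO at the chosen priority. Whether the chosen level really sits below the relevant kernel threads still depends on how the administrator has configured them.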
* Re: Xterm Hangs - Possible scheduler defect?
From: Ingo Oeser @ 2005-02-25 0:51 UTC
To: Chad N. Tindel
Cc: Chris Friesen, Paulo Marques, Mike Galbraith, akpm, linux-kernel

Chad N. Tindel wrote:
> I think what we have is the need for two levels of applications:
>
> 1. That which wishes to be the highest priority userspace application, and
> wishes to preempt all other userspace applications. Such an application is
> OK being preempted by the kernel when the kernel needs to do work. IMHO,
> this should be the default behavior for any SCHED_FIFO application. If one
> of these has a bug and goes CPU-bound, the worst it can do is prevent other
> apps from ever using the CPU it is on.

That is basically what you do with SCHED_RR. (Be preempted after maximum
quantum, even if having work to do.)

> 2. Applications which actually want to be the highest priority thing on
> the system, including being higher than the kernel. These applications are
> OK with the fact that they may cause system hangs and deadlocks, and are
> careful not to shoot themselves in the foot.

This is SCHED_FIFO. (Strict priority scheduling, allowed to starve anything
below.)

So just try to use the right scheduler for your application right now, ok?

If your system is busy with a top priority task, why should the kernel
disturb it?

Things will stop anyway if your high priority task needs a resource
which is blocked. Then it becomes unrunnable and other tasks have
chances to continue. Kernel threads are likely to execute then, because
they are likely runnable then.

Your task could even migrate, if a lot of kernel tasks are waiting
on one CPU and your task is NOT bound to a specific CPU.

So the system is not brought down, but just busy in an unfortunate way.

Stupid applications can starve other applications for a while, but not
forever, because the kernel is still running and deciding.

Regards

Ingo Oeser

* Re: Xterm Hangs - Possible scheduler defect?
From: Chris Friesen @ 2005-02-25 15:12 UTC
To: Ingo Oeser
Cc: Chad N. Tindel, Paulo Marques, Mike Galbraith, akpm, linux-kernel

Ingo Oeser wrote:

> Stupid applications can starve other applications for a while, but not
> forever, because the kernel is still running and deciding.

Not so.

task 1: sched_rr, priority 1, takes mutex
task 2: sched_rr, priority 2, cpu hog, infinite loop
task 3: sched_rr, priority 99, tries to get mutex

And now tasks 1 and 3 are starved forever. Arguably bad application
design, but it demonstrates a case where applications can starve other
applications.

Chris

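The three-task scenario in the message above can be sketched as a toy simulation. This is only an illustration under stated assumptions: a single CPU, strict fixed-priority selection with no priority inheritance, and a hypothetical `simulate` helper that is not part of any real scheduler API. Task 3 starts out blocked because task 1 already holds the mutex.

```python
def simulate(tasks, ticks):
    """Strict fixed-priority scheduling of `tasks` on one CPU for `ticks` steps.

    Each task is a dict with 'name', 'prio' and 'blocked' (True while it
    waits for the mutex). Returns how many ticks each task got to run.
    """
    ran = {t["name"]: 0 for t in tasks}
    for _ in range(ticks):
        runnable = [t for t in tasks if not t["blocked"]]
        if not runnable:
            continue
        # The highest-priority runnable task always wins the CPU.
        current = max(runnable, key=lambda t: t["prio"])
        ran[current["name"]] += 1
    return ran


tasks = [
    {"name": "task1", "prio": 1, "blocked": False},   # holds the mutex
    {"name": "task2", "prio": 2, "blocked": False},   # CPU hog, never sleeps
    {"name": "task3", "prio": 99, "blocked": True},   # waits for task1's mutex
]
result = simulate(tasks, ticks=1000)
print(result)  # {'task1': 0, 'task2': 1000, 'task3': 0}
```

Task 2 takes every tick: task 1 never gets the CPU to release the mutex, so task 3 never becomes runnable, no matter how long the simulation runs.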
* Re: Xterm Hangs - Possible scheduler defect?
From: Ingo Oeser @ 2005-02-25 15:39 UTC
To: Chris Friesen
Cc: Chad N. Tindel, Paulo Marques, Mike Galbraith, akpm, linux-kernel

Chris Friesen wrote:
> Ingo Oeser wrote:
> > Stupid applications can starve other applications for a while, but not
> > forever, because the kernel is still running and deciding.
>
> Not so.
>
> task 1: sched_rr, priority 1, takes mutex
> task 2: sched_rr, priority 2, cpu hog, infinite loop
> task 3: sched_rr, priority 99, tries to get mutex
>
> And now tasks 1 and 3 are starved forever. Arguably bad application
> design, but it demonstrates a case where applications can starve other
> applications.

You are right. In

"If a SCHED_RR process has been running for a time period equal to or
longer than the time quantum, it will be put at the end of the list for
its priority"

I missed the "for its priority" part.

You would need to change the priority of task 1 until it releases the
mutex. Ideally the owner gets the maximum priority of
his and all the waiters on it, until it releases his mutex, where he regains
its old priority after release of mutex. But this priority elevation happens
only, if he is runnable. If not, he gets his old priority back, until he is
runnable.

But then again you just need to grab a mutex shared with a high priority
task and consume CPU.

Since this behavior is not defined in POSIX AFAIK, you just have
to write your applications properly or use SCHED_OTHER for CPU hogging.

Regards

Ingo Oeser

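The priority elevation described in the message above can be sketched as a toy model. It is a simplification under stated assumptions: one CPU, one mutex, an owner that stays runnable the whole time, and a hypothetical `run_schedule` helper with an `inherit` switch. It is not the kernel's implementation.

```python
def run_schedule(ticks, inherit, hold_ticks=3):
    """Return the order in which three tasks run on one CPU.

    task1 (prio 1) holds the mutex and needs `hold_ticks` ticks to release it.
    task2 (prio 2) is a CPU hog that never finishes.
    task3 (prio 99) waits for the mutex and needs one tick once it has it.
    With inherit=True the mutex owner borrows its waiters' priority.
    """
    base_prio = {"task1": 1, "task2": 2, "task3": 99}
    owner = "task1"
    waiters = ["task3"]
    work_left = {"task1": hold_ticks, "task2": None, "task3": 1}
    finished = set()
    order = []

    def effective_prio(name):
        # The owner inherits the highest priority among the tasks it blocks.
        if inherit and name == owner and waiters:
            return max([base_prio[name]] + [base_prio[w] for w in waiters])
        return base_prio[name]

    for _ in range(ticks):
        runnable = [n for n in base_prio
                    if n not in waiters and n not in finished]
        current = max(runnable, key=effective_prio)
        order.append(current)
        if work_left[current] is None:
            continue  # the hog never finishes
        work_left[current] -= 1
        if work_left[current] == 0:
            finished.add(current)
            if current == owner:
                # Release the mutex and wake the highest-priority waiter.
                owner = None
                if waiters:
                    owner = max(waiters, key=base_prio.get)
                    waiters.remove(owner)
    return order


print(run_schedule(6, inherit=False))
print(run_schedule(6, inherit=True))
```

Without inheritance the hog takes all six ticks. With it, task1 runs for three ticks at the borrowed priority, releases the mutex, task3 runs next, and only then does the hog get the CPU back.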
* Re: Xterm Hangs - Possible scheduler defect?
From: Paulo Marques @ 2005-02-25 15:53 UTC
To: Ingo Oeser
Cc: Chris Friesen, Chad N. Tindel, Mike Galbraith, akpm, linux-kernel

Ingo Oeser wrote:
> Chris Friesen wrote:
>
>>Ingo Oeser wrote:
>>[...]
> You would need to change the priority of task 1 until it releases the
> mutex. Ideally the owner gets the maximum priority of
> his and all the waiters on it, until it releases his mutex, where he regains
> its old priority after release of mutex. But this priority elevation happens
> only, if he is runnable. If not, he gets his old priority back, until he is
> runnable.

This is called a "priority inversion" problem, and there was some work
done by Ingo Molnar to make the scheduler aware of such cases and handle
them appropriately.

You can follow this thread for more info:

http://marc.theaimsgroup.com/?l=linux-kernel&m=110106915415886&w=2

I really don't know what the current state is, but this is nothing new...

--
Paulo Marques - www.grupopie.com

All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke (1729 - 1797)

* Re: Xterm Hangs - Possible scheduler defect?
From: Lee Revell @ 2005-02-25 16:24 UTC
To: Paulo Marques
Cc: Ingo Oeser, Chris Friesen, Chad N. Tindel, Mike Galbraith, akpm, linux-kernel

On Fri, 2005-02-25 at 15:53 +0000, Paulo Marques wrote:
> Ingo Oeser wrote:
> > Chris Friesen wrote:
> >
> >>Ingo Oeser wrote:
> >>[...]
> > You would need to change the priority of task 1 until it releases the
> > mutex. Ideally the owner gets the maximum priority of
> > his and all the waiters on it, until it releases his mutex, where he regains
> > its old priority after release of mutex. But this priority elevation happens
> > only, if he is runnable. If not, he gets his old priority back, until he is
> > runnable.
>
> This is called a "priority inversion" problem, and there was some work
> done by Ingo Molnar to make the scheduler aware of such cases and handle
> them appropriately.
>
> You can follow this thread for more info:
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110106915415886&w=2
>

The solution to your problem (which is as old as the hills) involves
priority inheriting mutexes which are available in the RT preempt patch
(if you build with CONFIG_PREEMPT_RT). This should be usable for hard
realtime applications.

http://people.redhat.com/mingo/realtime-preempt

If you just need very good soft realtime performance I recommend
PREEMPT_DESKTOP.

Lee

* Re: Xterm Hangs - Possible scheduler defect?
From: Chris Friesen @ 2005-02-25 17:07 UTC
To: Lee Revell
Cc: Paulo Marques, Ingo Oeser, Chad N. Tindel, Mike Galbraith, akpm, linux-kernel

Lee Revell wrote:

> The solution to your problem (which is as old as the hills) involves
> priority inheriting mutexes which are available in the RT preempt patch
> (if you build with CONFIG_PREEMPT_RT). This should be usable for hard
> realtime applications.

Yup. I was just pointing out that userspace apps *can* block other
userspace apps.

> http://people.redhat.com/mingo/realtime-preempt
>
> If you just need very good soft realtime performance I recommend
> PREEMPT_DESKTOP.

How does this compare with Inaky's "robust mutexes" patch?

Chris

* Re: Xterm Hangs - Possible scheduler defect?
From: Barry K. Nathan @ 2005-02-24 19:52 UTC
To: Chad N. Tindel
Cc: Paulo Marques, Chris Friesen, Mike Galbraith, akpm, linux-kernel

> > This is much, much better than the "users are stupid, we must protect
> > them from themselves" kind of way that other OSes use.
>
> Isn't this what the kernel attempts to do in many other places? What else
> could possibly be the point of sending SIGSEGV and causing applications
> to dump core when they make a mistake referencing memory? Isn't it the
> kernel's job to protect one application from another?

A related example:

Typically, when a program (even when running as root) attempts to access
I/O ports directly, it gets killed as you describe.

However, the X server, running as root, can use ioperm or iopl to
request permission to access the video card's I/O ports directly. When it
gets that permission, it can do that and no longer gets killed. It also
means the X server is capable of bringing the entire system down via errant
I/O port accesses if it wishes (or if it misbehaves).

The general philosophy is to protect one application from another,
unless an application has been specifically granted sufficient power to
wreck the system.

I don't remember off the top of my head whether SCHED_FIFO tasks are
supposed to be able to take SMP systems down, if the number of SCHED_FIFO
tasks is less than the number of CPUs. I imagine someone has thought about
this in the past and answered the question one way or another, but I don't
happen to know the answer.

-Barry K. Nathan <barryn@pobox.com>

* Re: Xterm Hangs - Possible scheduler defect?
From: Helge Hafting @ 2005-02-25 20:25 UTC
To: Chad N. Tindel
Cc: Paulo Marques, Chris Friesen, Mike Galbraith, akpm, linux-kernel

On Thu, Feb 24, 2005 at 02:22:37PM -0500, Chad N. Tindel wrote:
> > If you keep a learning attitude, there is a chance for this discussion
> > to go on. However, if you keep the "Come now, don't bullshit me, this is
> > a broken architecture and you're just trying to cover up" attitude,
> > you're just going to get discarded as a troll.
>
> I'm not trying to troll here; I suppose I'm just coming from a different
> background. I'll try to adjust my tone.
>
> > I personally like the linux way: "root has the ability to shoot himself
> > in the foot if he wants to". This is my computer, damn it, I am the one
> > who tells it what to do.
>
> I'm all for allowing people to shoot themselves in the foot. That doesn't
> mean that it is OK for a single userspace thread to mess up a 64-way box.
>

What's so special about a 64-way box?

Note that the box wasn't messed up - the thread merely used too much cpu.
It is perfectly ok - even on a 64-way box - to have a thread that runs
with higher priority than all the kernel threads - *if* it occasionally
sleeps. That means the thread can get very low latency work done,
and the kernel threads will simply wait a little. Then the thread sleeps,
and those crucial kernel threads move on.

A high-priority thread that doesn't run all the time is no problem,
and it may need the ability to preempt kernel threads occasionally
due to timing constraints.

In the case mentioned, the high-priority thread ran all the time.
That's bad, but there is no way the kernel can guess that it was a bad idea
in that case. The kernel does what it is told.

An ordinary user can't use such priorities, so there is no security problem
here. Only root can, and root has the power to disrupt service anyway
(shutdown, kill any process, delete any file.)

Someone who runs as root is _trusted_ to do the right thing; this trust
might be outside the scope of the os. In other words, some people are
allowed to run special processes, by the machine owner. Some get the root
password - and they are supposed to be above the "crowds" and not crash the
machine just because they can.

Helge Hafting

* Re: Xterm Hangs - Possible scheduler defect?
From: Chad N. Tindel @ 2005-02-25 21:02 UTC
To: Helge Hafting
Cc: Paulo Marques, Chris Friesen, Mike Galbraith, akpm, linux-kernel

> What's so special about a 64-way box?

They're expensive and customers don't expect a single userspace thread to
tie up the other 63 CPUs no matter how buggy it is. It is intuitively obvious
that a buggy kernel can bring a system to its knees, but it is not intuitively
obvious that a buggy userspace app can do the same thing. It is more of a
supportability issue than anything, because you expect the other processors
to function properly so you can get in and live-debug the application when it
hits a bug that makes it CPU-bound. This is especially important if the box
is, say, in a remote jungle of China or something where you don't have access
to the console.

The horse is dead, so let's not beat it anymore for the time being. It is
quite clear that people don't want Linux to (by default) not have the gun
cocked and pointed at the application developer's feet. People who want a
kernel that doesn't hang in the face of bad-acting userspace apps can change
the priority of important kernel threads, which seems like a reasonable
workaround for now.

Regards,

Chad

* Re: Xterm Hangs - Possible scheduler defect?
From: Lee Revell @ 2005-02-25 23:24 UTC
To: Chad N. Tindel
Cc: Helge Hafting, Paulo Marques, Chris Friesen, Mike Galbraith, akpm, linux-kernel

On Fri, 2005-02-25 at 16:02 -0500, Chad N. Tindel wrote:
> They're expensive and customers don't expect a single userspace thread to
> tie up the other 63 CPUs no matter how buggy it is. It is intuitively obvious
> that a buggy kernel can bring a system to its knees, but it is not intuitively
> obvious that a buggy userspace app can do the same thing. It is more of a
> supportability issue than anything, because you expect the other processors
> to function properly so you can get in and live-debug the application when it
> hits a bug that makes it CPU-bound. This is especially important if the box
> is, say, in a remote jungle of China or something where you don't have access
> to the console.

"Unix policy is to not stop root from doing stupid things because
that would also stop him from doing clever things." - Andi Kleen

"It's such a fine line between stupid and clever" - Derek Smalls

Lee

* Re: Xterm Hangs - Possible scheduler defect?
From: Helge Hafting @ 2005-02-26 11:58 UTC
To: Chad N. Tindel
Cc: Helge Hafting, Paulo Marques, Chris Friesen, Mike Galbraith, akpm, linux-kernel

On Fri, Feb 25, 2005 at 04:02:26PM -0500, Chad N. Tindel wrote:
> > What's so special about a 64-way box?
>
> They're expensive and customers don't expect a single userspace thread to
> tie up the other 63 CPUs no matter how buggy it is. It is intuitively obvious
> that a buggy kernel can bring a system to its knees, but it is not intuitively
> obvious that a buggy userspace app can do the same thing. It is more of a
> supportability issue than anything, because you expect the other processors
> to function properly so you can get in and live-debug the application when it
> hits a bug that makes it CPU-bound. This is especially important if the box
> is, say, in a remote jungle of China or something where you don't have access
> to the console.
>

These are very good points. And the solution exists - if you want
these options then simply run the program at a lower priority than
the kernel threads. Doing this is not a problem. You _can_ run
a process at highest priority, but you don't have to!

> The horse is dead, so let's not beat it anymore for the time being. It is
> quite clear that people don't want Linux to (by default) not have the gun
> cocked and pointed at the application developer's feet.

Linux is safe, and you bring up a non-issue. So what if the app couldn't
get higher priority than kernel threads? You could then implement it
as a kernel thread and get the same problem anyway. No difference.

> People who want a
> kernel that doesn't hang in the face of bad-acting userspace apps can change
> the priority of important kernel threads, which seems like a reasonable
> workaround for now.
>

Yes, or they can simply run the app at a slightly lower priority until
it is fully tested so they know it can be trusted.

People sometimes need to not be delayed by kernel threads, and that is not
a problem as long as the application gives up the cpu after it finishes
doing the time-critical work. We want linux to be able to do these kinds
of work too. Saying that the os doesn't have control does not make sense.
The os will give away a cpu - but only if _you_ let it.

Helge Hafting

* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 17:53 ` Xterm Hangs - Possible scheduler defect? Chad N. Tindel
2005-02-24 18:19 ` Chris Friesen
@ 2005-02-25 4:25 ` Mike Galbraith
1 sibling, 0 replies; 33+ messages in thread
From: Mike Galbraith @ 2005-02-25 4:25 UTC
To: Chad N. Tindel; +Cc: akpm, linux-kernel

At 12:53 PM 2/24/2005 -0500, Chad N. Tindel wrote:
> > > Hmmm... Are you suggesting it is OK for a kernel to get nearly completely
> > > hosed and for not fully utilize all the processors in the system because
> > > of one SCHED_FIFO thread?
> >
> > Sure. You specifically directed the scheduler to run your thread at a
> > higher priority than anything else. The way I see it, you used root's
> > prerogative to shoot himself in the foot. You could also have used root's
> > prerogative to don steel-toed shoes (set important kernel threads to a higher
> > priority) before pulling the trigger.
>
> No, I specifically directed the scheduler to run my thread at a higher
> priority than any other userspace application. The fact that I wrote it
> in userspace and not in kernel space implies that I am OK with the kernel
> stopping me sometimes when _it_ has work to do. If I wanted something
> higher priority than the kernel I would have written something in kernel
> space instead.

Nope. You may have _thought_ you told it that, but the reality is as I
described it.

> > > SCHED_FIFO thread are supposed to preempt
> > > all other userspace threads... not the kernel itself.
> >
> > Not so. The scheduler makes no distinction between user and kernel threads
> > of execution.
>
> That is SOOOO broken it isn't even funny.

I heartily disagree. I call it flexible/powerful.

> > If you think that's broken, you'll _love_ Ingo's IRQ threads...
>
> Yeah, that's broken too.

(You're not noticing the added power it gives you.)

> Perhaps I don't understand this philosophy you have where the kernel
> isn't more important than everything else. It seems to me like there needs
> to be a rigid hierarchy for scheduling, lest you get into deadlock problems:

Some kernel thread flushing buffers should be more important than my
userland trigger-pacemaker thread?

> Under no circumstances should any single CPU-bound userspace thread
> completely hose a 64-way SMP box.

I can certainly agree that any service which is required across processor
borders wants to be very high priority indeed, and I can further agree that
this crossing of borders would not exist in a perfect world.

-Mike
* Xterm Hangs - Possible scheduler defect?
@ 2005-02-23 23:06 Chad N. Tindel
2005-02-24 2:36 ` Andrew Morton
0 siblings, 1 reply; 33+ messages in thread
From: Chad N. Tindel @ 2005-02-23 23:06 UTC
To: linux-kernel
Hello-
We have hit a defect where an exiting xterm process will hang. This is running
on a 2-cpu IA-64 box. We have a multithreaded application, where one thread
is SCHED_FIFO and is running with priority 98, and the other thread is just
a normal SCHED_OTHER thread. The SCHED_FIFO thread is in a CPU bound tight
loop, but I wouldn't expect that to cause problems since there are 2 CPUs.
However, it does seem to cause some problems. For example, if you ssh into
the system and run an Xterm using X11 forwarding, when you type "exit" in
the xterm window, the window hangs and doesn't close. Killing the CPU-bound
app causes the window to exit immediately. The sysrq output shows the
following:
xterm D a0000001000bef60 0 2905 2876 (NOTLB)
Call Trace:
[<a0000001004ac480>] schedule+0xca0/0x1300
sp=e000000012257d20 bsp=e000000012251080
[<a0000001000bef60>] flush_cpu_workqueue+0x1a0/0x4a0
sp=e000000012257d30 bsp=e000000012251020
[<a0000001000bf360>] flush_workqueue+0x100/0x160
sp=e000000012257d90 bsp=e000000012250fe8
[<a0000001000bfd60>] flush_scheduled_work+0x20/0x40
sp=e000000012257d90 bsp=e000000012250fd0
[<a0000001002e2060>] release_dev+0x8e0/0x1100
sp=e000000012257d90 bsp=e000000012250f20
[<a0000001002e3350>] tty_release+0x30/0x60
sp=e000000012257e30 bsp=e000000012250ef8
[<a00000010012d430>] __fput+0x330/0x340
sp=e000000012257e30 bsp=e000000012250ea8
[<a00000010012d0e0>] fput+0x40/0x60
sp=e000000012257e30 bsp=e000000012250e88
[<a00000010012a1b0>] filp_close+0xd0/0x160
sp=e000000012257e30 bsp=e000000012250e58
[<a00000010012a380>] sys_close+0x140/0x1a0
sp=e000000012257e30 bsp=e000000012250dd8
[<a00000010000aba0>] ia64_ret_from_syscall+0x0/0x20
sp=e000000012257e30 bsp=e000000012250dd8
So it would appear that xterm is hung in close() trying to shutdown a tty.
The comment says that it is calling flush_scheduled_work() to
"Wait for ->hangup_work and ->flip.work handlers to terminate". Perhaps there
is some locking issue that is causing these to not run and complete?
I'm a bit out of my space here... does anybody have any ideas? I've tried
this on both 2.6.8 and 2.6.10 with the same problem resulting.
Please make sure to CC me in any responses.
Regards,
Chad
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-23 23:06 Chad N. Tindel
@ 2005-02-24 2:36 ` Andrew Morton
2005-02-24 5:23 ` Chad N. Tindel
2005-02-24 5:26 ` Chad N. Tindel
0 siblings, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2005-02-24 2:36 UTC
To: Chad N. Tindel; +Cc: linux-kernel

"Chad N. Tindel" <chad@tindel.net> wrote:
>
> We have hit a defect where an exiting xterm process will hang. This is running
> on a 2-cpu IA-64 box. We have a multithreaded application, where one thread
> is SCHED_FIFO and is running with priority 98, and the other thread is just
> a normal SCHED_OTHER thread. The SCHED_FIFO thread is in a CPU bound tight
> loop, but I wouldn't expect that to cause problems since there are 2 CPUs.
>
> However, it does seem to cause some problems. For example, if you ssh into
> the system and run an Xterm using X11 forwarding, when you type "exit" in
> the xterm window, the window hangs and doesn't close. Killing the CPU-bound
> app causes the window to exit immediately. The sysrq output shows the
> following:
>
> xterm D a0000001000bef60 0 2905 2876 (NOTLB)
>
> Call Trace:
> [<a0000001004ac480>] schedule+0xca0/0x1300
> sp=e000000012257d20 bsp=e000000012251080
> [<a0000001000bef60>] flush_cpu_workqueue+0x1a0/0x4a0
> sp=e000000012257d30 bsp=e000000012251020
> [<a0000001000bf360>] flush_workqueue+0x100/0x160
> sp=e000000012257d90 bsp=e000000012250fe8
> [<a0000001000bfd60>] flush_scheduled_work+0x20/0x40
> sp=e000000012257d90 bsp=e000000012250fd0
> [<a0000001002e2060>] release_dev+0x8e0/0x1100
> sp=e000000012257d90 bsp=e000000012250f20
> [<a0000001002e3350>] tty_release+0x30/0x60
> sp=e000000012257e30 bsp=e000000012250ef8
> [<a00000010012d430>] __fput+0x330/0x340
> sp=e000000012257e30 bsp=e000000012250ea8
> [<a00000010012d0e0>] fput+0x40/0x60
> sp=e000000012257e30 bsp=e000000012250e88
> [<a00000010012a1b0>] filp_close+0xd0/0x160
> sp=e000000012257e30 bsp=e000000012250e58
> [<a00000010012a380>] sys_close+0x140/0x1a0
> sp=e000000012257e30 bsp=e000000012250dd8
> [<a00000010000aba0>] ia64_ret_from_syscall+0x0/0x20
> sp=e000000012257e30 bsp=e000000012250dd8
>
> So it would appear that xterm is hung in close() trying to shutdown a tty.
> The comment says that it is calling flush_scheduled_work() to
> "Wait for ->hangup_work and ->flip.work handlers to terminate". Perhaps there
> is some locking issue that is causing these to not run and complete?

`xterm' is waiting for the other CPU to schedule a kernel thread (which is
bound to that CPU). Once that kernel thread has done a little bit of work,
`xterm' can terminate.

But kernel threads don't run with realtime policy, so your userspace app
has permanently starved that kernel thread.

It's potentially quite a problem, really. For example it could prevent
various tty operations from completing, it will prevent kjournald from ever
writing back anything (on uniprocessor, etc). I've been waiting for
someone to complain ;)

But the other side of the coin is that a SCHED_FIFO userspace task
presumably has extreme latency requirements, so it doesn't *want* to be
preempted by some routine kernel operation. People would get irritated if
we were to do that.

So what to do?
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 2:36 ` Andrew Morton
@ 2005-02-24 5:23 ` Chad N. Tindel
2005-02-24 6:50 ` Andrew Morton
2005-02-24 5:26 ` Chad N. Tindel
1 sibling, 1 reply; 33+ messages in thread
From: Chad N. Tindel @ 2005-02-24 5:23 UTC
To: Andrew Morton; +Cc: linux-kernel

> `xterm' is waiting for the other CPU to schedule a kernel thread (which is
> bound to that CPU). Once that kernel thread has done a little bit of work,
> `xterm' can terminate.
>
> But kernel threads don't run with realtime policy, so your userspace app
> has permanently starved that kernel thread.
>
> It's potentially quite a problem, really. For example it could prevent
> various tty operations from completing, it will prevent kjournald from ever
> writing back anything (on uniprocessor, etc). I've been waiting for
> someone to complain ;)
>
> But the other side of the coin is that a SCHED_FIFO userspace task
> presumably has extreme latency requirements, so it doesn't *want* to be
> preempted by some routine kernel operation. People would get irritated if
> we were to do that.
>
> So what to do?

It shouldn't need to preempt the kernel operation. Why is the design such that
the necessary kernel thread can't run on the other CPU?

Chad
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 5:23 ` Chad N. Tindel
@ 2005-02-24 6:50 ` Andrew Morton
0 siblings, 0 replies; 33+ messages in thread
From: Andrew Morton @ 2005-02-24 6:50 UTC
To: Chad N. Tindel; +Cc: linux-kernel

"Chad N. Tindel" <chad@tindel.net> wrote:
>
> > `xterm' is waiting for the other CPU to schedule a kernel thread (which is
> > bound to that CPU). Once that kernel thread has done a little bit of work,
> > `xterm' can terminate.
> >
> > But kernel threads don't run with realtime policy, so your userspace app
> > has permanently starved that kernel thread.
> >
> > It's potentially quite a problem, really. For example it could prevent
> > various tty operations from completing, it will prevent kjournald from ever
> > writing back anything (on uniprocessor, etc). I've been waiting for
> > someone to complain ;)
> >
> > But the other side of the coin is that a SCHED_FIFO userspace task
> > presumably has extreme latency requirements, so it doesn't *want* to be
> > preempted by some routine kernel operation. People would get irritated if
> > we were to do that.
> >
> > So what to do?
>
> It shouldn't need to preempt the kernel operation. Why is the design such that
> the necessary kernel thread can't run on the other CPU?
>

This particular kernel function is implemented via a kernel thread per CPU,
with each thread bound to its own CPU. The xterm-does-exit cleanup code is
waiting for the thread which is bound to the busy CPU to do something.

No other CPU can do, or is allowed to do, that thread's work. If it were to
do so, the implicit locking which we get from the per-cpuness would be
violated.

I don't know if any clients of the workqueue code rely upon the
pinned-to-cpu feature.
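Editorial sketch, not part of the thread: to see which tasks on a box are running under a realtime policy, and which CPU each one last ran on, something like the following may help when diagnosing a hang of this kind. It assumes a procps-style ps that supports the tid, cls, rtprio, psr and comm output fields. The function names filter_realtime and list_realtime_tasks are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumes procps ps with the tid, cls, rtprio, psr and comm
# output fields. Function names are illustrative, not from the thread.

# Read "pid tid cls rtprio psr comm" lines on stdin and keep only the rows
# whose scheduling class is SCHED_FIFO (FF) or SCHED_RR (RR).
filter_realtime() {
    awk '$3 == "FF" || $3 == "RR"'
}

# List every thread in the system that runs under a realtime policy.
# The psr column is the CPU the thread last ran on.
list_realtime_tasks() {
    ps -eLo pid,tid,cls,rtprio,psr,comm | filter_realtime
}
```

Calling list_realtime_tasks on the affected machine would be expected to show the priority 98 SCHED_FIFO thread and its CPU, while the per-CPU kernel thread that xterm is waiting on would be absent from the list because it is not realtime.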
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 2:36 ` Andrew Morton
2005-02-24 5:23 ` Chad N. Tindel
@ 2005-02-24 5:26 ` Chad N. Tindel
2005-02-24 13:25 ` Helge Hafting
1 sibling, 1 reply; 33+ messages in thread
From: Chad N. Tindel @ 2005-02-24 5:26 UTC
To: Andrew Morton; +Cc: linux-kernel

> But the other side of the coin is that a SCHED_FIFO userspace task
> presumably has extreme latency requirements, so it doesn't *want* to be
> preempted by some routine kernel operation. People would get irritated if
> we were to do that.

Just to follow up a bit. People writing apps that run at SCHED_FIFO know
that they aren't getting hard real-time, and they are OK with that. If they
wanted something more they'd run on RTLinux. Why would it be wrong to preempt
the SCHED_FIFO process in this case, assuming that it is too hard to fix a broken
design that doesn't allow the necessary kernel threads to run on any CPU?

Chad
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 5:26 ` Chad N. Tindel
@ 2005-02-24 13:25 ` Helge Hafting
2005-02-24 17:33 ` Chad N. Tindel
0 siblings, 1 reply; 33+ messages in thread
From: Helge Hafting @ 2005-02-24 13:25 UTC
To: Chad N. Tindel; +Cc: Andrew Morton, linux-kernel

Chad N. Tindel wrote:

> > But the other side of the coin is that a SCHED_FIFO userspace task
> > presumably has extreme latency requirements, so it doesn't *want* to be
> > preempted by some routine kernel operation. People would get irritated if
> > we were to do that.
>
> Just to follow up a bit. People writing apps that run at SCHED_FIFO know
> that they aren't getting hard real-time, and they are OK with that. If they
> wanted something more they'd run on RTLinux. Why would it be wrong to preempt
> the SCHED_FIFO process in this case, assuming that it is too hard to fix a broken
> design that doesn't allow the necessary kernel threads to run on any CPU?
>
Why would anyone write a thread that uses exactly 100% of one cpu?
It seems wrong to me. It is unsafe if they really need that much
processing power, what if an interrupt happens? Then they get slightly less.
If they got a 10% faster cpu, would this thread suddenly drop to 90%
usage (and no problems with kernel threads)?

If it stays at 100% then that suggests they are using some
sort of busy waiting which is bad programming. Get a better hw
device that will provide an interrupt at the right time, and write a
driver for that. Or find some spot in the code where a small delay is
acceptable, and set a short timer and sleep on it. It doesn't take much
to get this kernel thread going.

Helge Hafting
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 13:25 ` Helge Hafting
@ 2005-02-24 17:33 ` Chad N. Tindel
2005-02-24 22:25 ` Peter Chubb
2005-02-24 23:00 ` Andrew Morton
0 siblings, 2 replies; 33+ messages in thread
From: Chad N. Tindel @ 2005-02-24 17:33 UTC
To: Helge Hafting; +Cc: Andrew Morton, linux-kernel

> Why would anyone write a thread that uses exactly 100% of one cpu?
> It seems wrong to me. It is unsafe if they really need that much
> processing power, what if an interrupt happens? Then they get slightly less.
> If they got a 10% faster cpu, would this thread suddenly drop to 90%
> usage (and no problems with kernel threads)?
> If it stays at 100% then that suggests they are using some
> sort of busy waiting which is bad programming. Get a better hw
> device that will provide an interrupt at the right time, and write a
> driver for that. Or find some spot in the code where a small delay is
> acceptable, and set a short timer and sleep on it. It doesn't take much
> to get this kernel thread going.

I would make the following assertion for any kernel:

No single userspace thread of execution running on an SMP system should be
able to hose a box by going CPU-bound, bug in the software or no bug. Any
kernel should be able to handle this case and shift general work over to
other processors.

While I can't speak for all commercial Unixes, I know that HP-UX handles
this case just fine. I'd be extremely surprised if Solaris and AIX didn't
handle it fine too. What I can't understand is why you want to cop-out and
say "Oh well this is just a bug in the application... the programmer shouldn't
shoot himself in the foot."

If that were the attitude that kernel programmers had, why have the kernel
send SIGSEGV when applications reference invalid memory? Why not just let
them corrupt the memory of other apps and possibly bring the whole system down?
It is the kernel's job to protect itself and userspace applications from
runaway applications whenever possible. While this might not be possible
for this case on a UP box, it certainly is for an SMP box.

Chad
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 17:33 ` Chad N. Tindel
@ 2005-02-24 22:25 ` Peter Chubb
2005-02-24 22:40 ` Chad N. Tindel
2005-02-24 23:00 ` Andrew Morton
1 sibling, 1 reply; 33+ messages in thread
From: Peter Chubb @ 2005-02-24 22:25 UTC
To: Chad N. Tindel; +Cc: Helge Hafting, Andrew Morton, linux-kernel

>>>>> "Chad" == Chad N Tindel <chad@tindel.net> writes:

Chad> I would make the following assertion for any kernel:

Chad> No single userspace thread of execution running on an SMP system
Chad> should be able to hose a box by going CPU-bound, bug in the
Chad> software or no bug. Any kernel should be able to handle this
Chad> case and shift general work over to other processors.

In many Unices, crucial kernel threads run at realtime priority with a
static priority higher than is accessible to user code.

That being said, however, you've got to be a privileged user to set
real time very high priority on a thread, and if you do, you'd better
know what you're doing. Any SCHED_FIFO thread should run for a time,
then sleep for a time, or it *will* DOS everything else on the
processor.

--
Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au
The technical we do immediately, the political takes *forever*
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 22:25 ` Peter Chubb
@ 2005-02-24 22:40 ` Chad N. Tindel
0 siblings, 0 replies; 33+ messages in thread
From: Chad N. Tindel @ 2005-02-24 22:40 UTC
To: Peter Chubb; +Cc: Helge Hafting, Andrew Morton, linux-kernel

> In many Unices, crucial kernel threads run at realtime priority with a
> static priority higher than is accessible to user code.

Yep.

> That being said, however, you've got to be a privileged user to set
> real time very high priority on a thread, and if you do, you'd better
> know what you're doing. Any SCHED_FIFO thread should run for a time,
> then sleep for a time, or it *will* DOS everything else on the
> processor.

This is only true if you're not doing what you said in your first paragraph,
i.e. running crucial kernel threads higher than any user thread.

Chad
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 17:33 ` Chad N. Tindel
2005-02-24 22:25 ` Peter Chubb
@ 2005-02-24 23:00 ` Andrew Morton
2005-02-24 23:22 ` Chris Friesen
2005-02-25 0:47 ` Kyle Moffett
1 sibling, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2005-02-24 23:00 UTC
To: Chad N. Tindel; +Cc: helge.hafting, linux-kernel

"Chad N. Tindel" <chad@tindel.net> wrote:
>
> I would make the following assertion for any kernel:
>
> No single userspace thread of execution running on an SMP system should be
> able to hose a box by going CPU-bound, bug in the software or no bug.

But if we were to enforce that policy, realtime policy would become less
useful. You haven't even acknowledged that such a tradeoff exists, let
alone demonstrated that we're on the wrong side of it.

Here's a quicky which will convert all your kernel threads to SCHED_RR,
priority 99. Please test.

#!/bin/sh

PIDS=$(ps axo pid,command | grep ' \[.*\]$' | sed -e 's/ \[.*\]$//')

for i in $PIDS
do
	chrt -r 99 -9 $i
done
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 23:00 ` Andrew Morton
@ 2005-02-24 23:22 ` Chris Friesen
2005-02-24 23:32 ` Andrew Morton
2005-02-25 0:47 ` Kyle Moffett
1 sibling, 1 reply; 33+ messages in thread
From: Chris Friesen @ 2005-02-24 23:22 UTC
To: Andrew Morton; +Cc: Chad N. Tindel, helge.hafting, linux-kernel

Andrew Morton wrote:

> #!/bin/sh
>
> PIDS=$(ps axo pid,command | grep ' \[.*\]$' | sed -e 's/ \[.*\]$//')
>
> for i in $PIDS
> do
> 	chrt -r 99 -9 $i
> done

For the unaware, "chrt" is part of the schedutils package. (I didn't
know about it till just now...figured I'd save others the trouble of
searching.)

Chris
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 23:22 ` Chris Friesen
@ 2005-02-24 23:32 ` Andrew Morton
0 siblings, 0 replies; 33+ messages in thread
From: Andrew Morton @ 2005-02-24 23:32 UTC
To: Chris Friesen; +Cc: chad, helge.hafting, linux-kernel

Chris Friesen <cfriesen@nortel.com> wrote:
>
> Andrew Morton wrote:
>
> > 	chrt -r 99 -9 $i

Make that

	chrt -r 99 -p $i
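Editorial sketch, not part of the thread: folding the "-p" correction back into the script gives roughly the following. It is a hedged variant, not Andrew Morton's exact script. The helper names and the DRY_RUN switch are made up for illustration. The PID filter uses awk instead of the original grep and sed pair. The chrt arguments are written as "-r -p 99 PID", which is the order the util-linux chrt manual page documents, so check your own chrt before relying on it. Like the original, it assumes that any command ps prints in square brackets is a kernel thread.

```shell
#!/bin/sh
# Sketch only: move every kernel thread to SCHED_RR priority 99.
# Assumes root, a procps-style ps and util-linux chrt. The function names
# and the DRY_RUN variable are illustrative, not from the thread.

# Read "pid command" lines on stdin and print the PIDs whose command is
# shown in square brackets, which is how ps displays kernel threads.
kernel_thread_pids() {
    awk '$2 ~ /^\[/ && $NF ~ /\]$/ { print $1 }'
}

# With DRY_RUN=1 the chrt commands are printed instead of being executed.
boost_kernel_threads() {
    ps axo pid,command | kernel_thread_pids | while read -r pid
    do
        if [ "${DRY_RUN:-0}" = 1 ]
        then
            echo "chrt -r -p 99 $pid"
        else
            chrt -r -p 99 "$pid"
        fi
    done
}
```

Running "DRY_RUN=1 boost_kernel_threads" first shows which threads would be touched without changing anything.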
* Re: Xterm Hangs - Possible scheduler defect?
2005-02-24 23:00 ` Andrew Morton
2005-02-24 23:22 ` Chris Friesen
@ 2005-02-25 0:47 ` Kyle Moffett
1 sibling, 0 replies; 33+ messages in thread
From: Kyle Moffett @ 2005-02-25 0:47 UTC
To: Andrew Morton; +Cc: helge.hafting, linux-kernel, Chad N. Tindel

On Feb 24, 2005, at 18:00, Andrew Morton wrote:
> Here's a quicky which will convert all your kernel threads to SCHED_RR,
> priority 99. Please test.

We have a bunch of workstations here where we run a similar thing during
boot, as well as starting a SCHED_RR @ 99 sulogin-type process on tty12.
It makes blasting the occasional annoying fork-bomb or
CPU-chewing-crashed-X a lot nicer.

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------
Thread overview: 33+ messages
[not found] <20050224075756.GA18639@calma.pair.com>
[not found] ` <30111.1109237503@www1.gmx.net>
2005-02-24 17:53 ` Xterm Hangs - Possible scheduler defect? Chad N. Tindel
2005-02-24 18:19 ` Chris Friesen
2005-02-24 18:38 ` Chad N. Tindel
2005-02-24 19:04 ` Paulo Marques
2005-02-24 19:22 ` Chad N. Tindel
2005-02-24 19:46 ` Chris Friesen
2005-02-24 20:08 ` Chad N. Tindel
2005-02-24 20:29 ` Chris Friesen
2005-02-25 0:51 ` Ingo Oeser
2005-02-25 15:12 ` Chris Friesen
2005-02-25 15:39 ` Ingo Oeser
2005-02-25 15:53 ` Paulo Marques
2005-02-25 16:24 ` Lee Revell
2005-02-25 17:07 ` Chris Friesen
2005-02-24 19:52 ` Barry K. Nathan
2005-02-25 20:25 ` Helge Hafting
2005-02-25 21:02 ` Chad N. Tindel
2005-02-25 23:24 ` Lee Revell
2005-02-26 11:58 ` Helge Hafting
2005-02-25 4:25 ` Mike Galbraith
2005-02-23 23:06 Chad N. Tindel
2005-02-24 2:36 ` Andrew Morton
2005-02-24 5:23 ` Chad N. Tindel
2005-02-24 6:50 ` Andrew Morton
2005-02-24 5:26 ` Chad N. Tindel
2005-02-24 13:25 ` Helge Hafting
2005-02-24 17:33 ` Chad N. Tindel
2005-02-24 22:25 ` Peter Chubb
2005-02-24 22:40 ` Chad N. Tindel
2005-02-24 23:00 ` Andrew Morton
2005-02-24 23:22 ` Chris Friesen
2005-02-24 23:32 ` Andrew Morton
2005-02-25 0:47 ` Kyle Moffett