* report a bug about sched_rt
@ 2009-07-24 10:57 sen wang
2009-07-24 12:14 ` Peter Zijlstra
2009-07-24 14:28 ` Arjan van de Ven
0 siblings, 2 replies; 37+ messages in thread
From: sen wang @ 2009-07-24 10:57 UTC (permalink / raw)
To: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel,
linux-kernel
I think something is wrong with sched_rt.

While debugging my system with rt_bandwidth_enabled, I had a running
realtime FIFO task on the sched_rt run queue while the fair run queue
was empty. I found that the idle task gets scheduled even though the
running task is still on the sched_rt run queue!

This happens once the rt runqueue has used up its rt_bandwidth
runtime: the scheduler then chooses the idle task instead of the
realtime FIFO task.

The reason is that when the scheduler tries to pick a realtime FIFO
task, it checks whether the rt runqueue is throttled (rt_throttled).
If so, it returns and tries the fair queue, but that is empty, so it
falls through to the sched_idle class.

I don't think this is reasonable. We should give the realtime FIFO
task a chance even when the rt runqueue has exceeded its runtime,
because the CPU is otherwise free.

To fix it while keeping rt_bandwidth working as before, I think
pick_next_task_rt() is the best place: it should check one more
condition, rq->cfs.nr_running.

So I modified pick_next_task_rt() like this and debugged it on my
omap3430 zoom2 board. It works!
static struct task_struct *pick_next_task_rt(struct rq *rq)
{
	struct sched_rt_entity *rt_se;
	struct task_struct *p;
	struct rt_rq *rt_rq;
	...
	if (rt_rq_throttled(rt_rq) && rq->cfs.nr_running)
		return NULL;
	...
}
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 10:57 report a bug about sched_rt sen wang
@ 2009-07-24 12:14 ` Peter Zijlstra
2009-07-24 13:04 ` sen wang
2009-07-24 14:28 ` Arjan van de Ven
1 sibling, 1 reply; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 12:14 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 18:57 +0800, sen wang wrote:
> I find something is wrong about sched_rt.
>
> when I am debugging my system with rt_bandwidth_enabled, there is a
> running realtime FIFO task in the sched_rt running queue and
> the fair running queue is empty. I found the idle task will be
> scheduled up when the running task still lie in the sched_rt running
> queue!
>
> this will happen when rt runqueue passed it's rt_bandwidth_enabled
> runtime,then the scheduler choose the idle task instead of realtime
> FIFO task.
>
> the reason lie in: when scheduler try to pick up a realtime FIFO task,
> it will check if rt_throttled is enabled,
> if so, it'll return and try fair queue but it is empty, then it come
> to the sched_idle class.
>
> I don't think it reasonable, we should give the realtime FIFO task the
> chance, even when rt runqueue passed it's runtime.
> because it is cpu's free time.
>
> To fix it ,and keep rt_bandwidth works as before, I think
> pick_next_task_rt() is the best space,

RT is about determinism, sometimes having some extra time dependent on
the runnability of SCHED_OTHER tasks is utterly useless.

If you don't like the throttle, disable it.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 12:14 ` Peter Zijlstra
@ 2009-07-24 13:04 ` sen wang
2009-07-24 13:14 ` Peter Zijlstra
0 siblings, 1 reply; 37+ messages in thread
From: sen wang @ 2009-07-24 13:04 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Linux is used in many fields. SCHED_OTHER tasks are important to
embedded systems.
If there is a task in the running state (a realtime task), how can we
schedule the idle task?
It is ridiculous!

Since the throttle has a bug, why not fix it?
We just modify the condition that is checked when picking rt tasks!

static struct task_struct *pick_next_task_rt(struct rq *rq)
{
	...
	if (rt_rq_throttled(rt_rq) && rq->cfs.nr_running)
		return NULL;
	...
}

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 18:57 +0800, sen wang wrote:
>> I find something is wrong about sched_rt.
>>
>> when I am debugging my system with rt_bandwidth_enabled, there is a
>> running realtime FIFO task in the sched_rt running queue and
>> the fair running queue is empty. I found the idle task will be
>> scheduled up when the running task still lie in the sched_rt running
>> queue!
>>
>> this will happen when rt runqueue passed it's rt_bandwidth_enabled
>> runtime,then the scheduler choose the idle task instead of realtime
>> FIFO task.
>>
>> the reason lie in: when scheduler try to pick up a realtime FIFO task,
>> it will check if rt_throttled is enabled,
>> if so, it'll return and try fair queue but it is empty, then it come
>> to the sched_idle class.
>>
>> I don't think it reasonable, we should give the realtime FIFO task the
>> chance, even when rt runqueue passed it's runtime.
>> because it is cpu's free time.
>>
>> To fix it ,and keep rt_bandwidth works as before, I think
>> pick_next_task_rt() is the best space,
>
> RT is about determinism, sometimes having some extra time dependent on
> the runnability of SCHED_OTHER tasks is utterly useless.
>
> If you don't like the throttle, disable it.
>
>
>

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 13:04 ` sen wang
@ 2009-07-24 13:14 ` Peter Zijlstra
2009-07-24 13:26 ` sen wang
0 siblings, 1 reply; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 13:14 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Don't top post -- again!

On Fri, 2009-07-24 at 21:04 +0800, sen wang wrote:
> Linux is used in many fieldes. SCHED_OTHER tasks is important to
> embedded system.

Irrelevant.

> if there is a running state task(a realtime task), how can we
> shcedule the idle task up?

Because it ran out of bandwidth.

> It is ridiculous!
>
> since the throttle has a bug, why not fix it?

It doesn't have a bug, therefore I won't fix it.

The throttle limits the RT tasks to a bandwidth w of u/p.
Since real-time scheduling is about determinism a maximum bandwidth
larger than the minimum bandwidth specified by w is useless since it
cannot be relied upon.

Therefore we don't run RT tasks beyond their bandwidth limit.

Go read up on scheduling theory.

Now you might want a bandwidth of 100% for your RT application (not
something I can recommend for the overall health of your machine) in
which case you're free to change this setting:

  echo -1 > /proc/sys/kernel/sched_rt_runtime_us

Should do that for you. Also read:

  Documentation/scheduler/sched-rt-group.txt

^ permalink raw reply [flat|nested] 37+ messages in thread
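As a rough illustration of the bandwidth w = u/p mentioned in the message above: the sketch below assumes that u corresponds to sched_rt_runtime_us and p to sched_rt_period_us, and that a runtime of -1 means the throttle is disabled. The helper name rt_bandwidth_fraction() is invented for this example and is not a kernel function; it is plain user-space arithmetic, not the scheduler's implementation.

```c
/*
 * Hypothetical helper (not kernel code): the fraction of each period
 * during which RT tasks are allowed to run.
 *
 *   runtime_us: assumed to mirror /proc/sys/kernel/sched_rt_runtime_us
 *   period_us:  assumed to mirror /proc/sys/kernel/sched_rt_period_us
 */
double rt_bandwidth_fraction(long runtime_us, long period_us)
{
	if (runtime_us < 0)
		return 1.0;	/* -1: throttle disabled, RT may use 100% */
	if (period_us <= 0)
		return 0.0;	/* invalid period: treated as no bandwidth in this sketch */
	if (runtime_us > period_us)
		runtime_us = period_us;	/* clamp, a fraction above 1 is meaningless */
	return (double)runtime_us / (double)period_us;
}
```

With the commonly documented defaults of 950000 us runtime and 1000000 us period, this gives 0.95, i.e. RT tasks are throttled for the remaining 5% of each period; passing -1 as the runtime gives 1.0.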
* Re: report a bug about sched_rt
2009-07-24 13:14 ` Peter Zijlstra
@ 2009-07-24 13:26 ` sen wang
2009-07-24 13:33 ` Peter Zijlstra
2009-07-25 11:10 ` Raistlin
0 siblings, 2 replies; 37+ messages in thread
From: sen wang @ 2009-07-24 13:26 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Don't lecture me on theory. Don't be so doctrinaire! OK?
If the CPU is free and there is a task in the running state, how can
you schedule the idle task?
I tell you again: we are not talking about a bandwidth of 100% for RT!
The bug lies in the bandwidth of (100 - X)% (X < 100).
Even during the 100 - X time, if there is an rt task, you should not
idle() the system.

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> Don't top post -- again!
>
> On Fri, 2009-07-24 at 21:04 +0800, sen wang wrote:
>> Linux is used in many fieldes. SCHED_OTHER tasks is important to
>> embedded system.
>
> Irrelevant.
>
>> if there is a running state task(a realtime task), how can we
>> shcedule the idle task up?
>
> Because it ran out of bandwidth.
>
>> It is ridiculous!
>>
>> since the throttle has a bug, why not fix it?
>
> It doesn't have a bug, therefore I won't fix it.
>
> The throttle limits the RT tasks to a bandwidth w of u/p.
> Since real-time scheduling is about determinism a maximum bandwidth
> larger than the minimum bandwidth specified by w is useless since it
> cannot be relied upon.
>
> Therefore we don't run RT tasks beyond their bandwidth limit.
>
> Go read up on scheduling theory.
>
> Now you might want a bandwidth of 100% for your RT application (not
> something I can recommend for the overall health of your machine) in
> which case you're free to change this setting:
>
>  echo -1 > /proc/sys/kernel/sched_rt_runtime_us
>
> Should do that for you. Also read:
>
>  Documentation/scheduler/sched-rt-group.txt
>
>
>

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 13:26 ` sen wang
@ 2009-07-24 13:33 ` Peter Zijlstra
2009-07-24 13:44 ` sen wang
2009-07-25 11:10 ` Raistlin
1 sibling, 1 reply; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 13:33 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 21:26 +0800, sen wang wrote:
> don't tell me what theory. don't be so doctrinairism! OK?
> If cpu is free and there is a running state task,how can you scdedule
> idle task up?
> I tell you again:we are not talking about a bandwidth of 100% for RT!
> Bug lies in the bandwidth of (100- X)%.(X<100)
> even in the time of 100-X,if there is a rt task, you should not idle()
> the system.

*sigh*

Yes we should. I appreciate that you might assume otherwise, but you're
wrong. Suppose you have two competing bandwidth groups, which one will
run over, to what purpose?

Also, your next top post will go to /dev/null.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 13:33 ` Peter Zijlstra
@ 2009-07-24 13:44 ` sen wang
2009-07-24 13:54 ` Peter Zijlstra
0 siblings, 1 reply; 37+ messages in thread
From: sen wang @ 2009-07-24 13:44 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 21:26 +0800, sen wang wrote:
>> don't tell me what theory. don't be so doctrinairism! OK?
>> If cpu is free and there is a running state task,how can you scdedule
>> idle task up?
>> I tell you again:we are not talking about a bandwidth of 100% for RT!
>> Bug lies in the bandwidth of (100- X)%.(X<100)
>> even in the time of 100-X,if there is a rt task, you should not idle()
>> the system.
>
> *sigh*
>
> Yes we should. I appreciate that you might assume otherwise, but you're
> wrong. Suppose you have two competing bandwidth groups, which one will
> run over, to what purpose?
>
> Also, your next top post will go to /dev/null.
>

OK! Maybe you have not understood what I said.
It is not two competing bandwidth groups. What if there is one active
group and the other is empty?
What do you do then?

Why not try it yourself: empty the fair queue, run an rt task, enable
the bandwidth limit and see what happens!

In many embedded systems the idle task will shut something down, but
the rt task assumes that while it is running, idle will not happen!

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 13:44 ` sen wang
@ 2009-07-24 13:54 ` Peter Zijlstra
2009-07-24 14:04 ` sen wang
2009-07-24 14:24 ` sen wang
0 siblings, 2 replies; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 13:54 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 21:44 +0800, sen wang wrote:
> 2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> > On Fri, 2009-07-24 at 21:26 +0800, sen wang wrote:
> >> don't tell me what theory. don't be so doctrinairism! OK?
> >> If cpu is free and there is a running state task,how can you scdedule
> >> idle task up?
> >> I tell you again:we are not talking about a bandwidth of 100% for RT!
> >> Bug lies in the bandwidth of (100- X)%.(X<100)
> >> even in the time of 100-X,if there is a rt task, you should not idle()
> >> the system.
> >
> > *sigh*
> >
> > Yes we should. I appreciate that you might assume otherwise, but you're
> > wrong. Suppose you have two competing bandwidth groups, which one will
> > run over, to what purpose?
> >
> > Also, your next top post will go to /dev/null.
> >
>
>
> OK ! maybe you has not understand what I said.
> It not two competing bandwidth groups. there is a active group and
> another is empty?
> How you do?

No, but the 1 group is the trivial case of many groups. Changing the
semantics for the trivial case is inconsistent at best, and confusing at
worst.

> Why not try it by your hand: empty the fair task, run a rt task,enable
> the bandwidth and
> see what will happen!

Oh, I know, I wrote the code.

> In many embedded system,idle task will lead to shutdown something, but
> the rt task will
> assume: when it run, idle will not happen!

How is it my problem when you design your system wrong?

If you want your 1 RT group to not get throttled, disable the throttle,
or adjust it to fit the parameters of your workload. If you don't want
idle to have latency impact on your RT tasks, fix your idle behaviour.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 13:54 ` Peter Zijlstra
@ 2009-07-24 14:04 ` sen wang
2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 14:24 ` sen wang
1 sibling, 1 reply; 37+ messages in thread
From: sen wang @ 2009-07-24 14:04 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 21:44 +0800, sen wang wrote:
>> 2009/7/24 Peter Zijlstra <peterz@infradead.org>:
>> > On Fri, 2009-07-24 at 21:26 +0800, sen wang wrote:
>> >> don't tell me what theory. don't be so doctrinairism! OK?
>> >> If cpu is free and there is a running state task,how can you scdedule
>> >> idle task up?
>> >> I tell you again:we are not talking about a bandwidth of 100% for RT!
>> >> Bug lies in the bandwidth of (100- X)%.(X<100)
>> >> even in the time of 100-X,if there is a rt task, you should not idle()
>> >> the system.
>> >
>> > *sigh*
>> >
>> > Yes we should. I appreciate that you might assume otherwise, but you're
>> > wrong. Suppose you have two competing bandwidth groups, which one will
>> > run over, to what purpose?
>> >
>> > Also, your next top post will go to /dev/null.
>> >
>>
>>
>> OK ! maybe you has not understand what I said.
>> It not two competing bandwidth groups. there is a active group and
>> another is empty?
>> How you do?
>
> No, but the 1 group is the trivial case of many groups. Changing the
> semantics for the trivial case is inconsistent at best, and confusing at
> worst.
>
>> Why not try it by your hand: empty the fair task, run a rt task,enable
>> the bandwidth and
>> see what will happen!
>
> Oh, I know, I wrote the code.
>
>> In many embedded system,idle task will lead to shutdown something, but
>> the rt task will
>> assume: when it run, idle will not happen!
>
> How is it my problem when you design your system wrong?
>
> If you want your 1 RT group to not get throttled, disable the throttle,
> or adjust it to fit the parameters of your workload. If you don't want
> idle to have latency impact on your RT tasks, fix your idle behaviour.
>
>
>

OK

Just one question:
if the CPU is free and there is a task in the running state, what do
you do?
Schedule the task, or schedule the idle task?

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 14:04 ` sen wang
@ 2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 14:53 ` sen wang
2009-07-24 15:07 ` sen wang
0 siblings, 2 replies; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 14:48 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 22:04 +0800, sen wang wrote:

> just one question:
> if cpu is free and there is running state task, how you do?
> schedule the task up? or schedule idle task up?

Well, when an RT group is over the bandwidth limit I don't consider them
runnable. Therefore, failing to find any other tasks, we run the idle
task.

^ permalink raw reply [flat|nested] 37+ messages in thread
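The two positions in this exchange can be compared with a toy user-space model. This is only a sketch under stated assumptions: struct toy_rq, enum toy_pick, pick_mainline() and pick_proposed() are names invented for illustration and do not exist in the kernel, and the real pick_next_task_rt() deals with much more state than three fields. The model only captures the class order discussed in the thread (RT first, then fair, then idle) and the extra rq->cfs.nr_running condition from the proposed change.

```c
#include <stdbool.h>

/* Toy model of one CPU's run queue; field names are illustrative only. */
struct toy_rq {
	unsigned int rt_nr_running;	/* queued realtime (FIFO/RR) tasks */
	unsigned int cfs_nr_running;	/* queued SCHED_OTHER (fair) tasks */
	bool rt_throttled;		/* RT group has used up its runtime */
};

enum toy_pick { PICK_RT, PICK_FAIR, PICK_IDLE };

/* Behaviour as described by Peter: a throttled RT queue is not runnable. */
enum toy_pick pick_mainline(const struct toy_rq *rq)
{
	if (rq->rt_nr_running && !rq->rt_throttled)
		return PICK_RT;
	if (rq->cfs_nr_running)
		return PICK_FAIR;
	return PICK_IDLE;
}

/* Behaviour as proposed by sen wang: skip RT only if fair tasks are waiting. */
enum toy_pick pick_proposed(const struct toy_rq *rq)
{
	if (rq->rt_nr_running && !(rq->rt_throttled && rq->cfs_nr_running))
		return PICK_RT;
	if (rq->cfs_nr_running)
		return PICK_FAIR;
	return PICK_IDLE;
}
```

The two functions differ in exactly one case: a throttled RT queue with an empty fair queue, where the first model picks idle and the second keeps running the RT task. Whenever fair tasks are queued, or the RT queue is not throttled, both models make the same choice.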
* Re: report a bug about sched_rt
2009-07-24 14:48 ` Peter Zijlstra
@ 2009-07-24 14:53 ` sen wang
2009-07-24 15:07 ` sen wang
1 sibling, 0 replies; 37+ messages in thread
From: sen wang @ 2009-07-24 14:53 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 22:04 +0800, sen wang wrote:
>
>> just one question:
>> if cpu is free and there is running state task, how you do?
>> schedule the task up? or schedule idle task up?
>
> Well, when an RT group is over the bandwidth limit I don't consider them
> runnable. Therefore, failing to find any other tasks, we run the idle
> task.
>

you consider them runnable. but sorry, what you consider is wrong!

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 14:53 ` sen wang
@ 2009-07-24 15:07 ` sen wang
2009-07-24 15:24 ` Peter Zijlstra
` (2 more replies)
1 sibling, 3 replies; 37+ messages in thread
From: sen wang @ 2009-07-24 15:07 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 22:04 +0800, sen wang wrote:
>
>> just one question:
>> if cpu is free and there is running state task, how you do?
>> schedule the task up? or schedule idle task up?
>
> Well, when an RT group is over the bandwidth limit I don't consider them
> runnable. Therefore, failing to find any other tasks, we run the idle
> task.
>

You haven't answered the question: if the CPU is free, should we
schedule the task in the running state or the idle task?

Face the error and fix it! OK?

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 15:07 ` sen wang
@ 2009-07-24 15:24 ` Peter Zijlstra
2009-07-24 15:43 ` sen wang
2009-07-24 15:34 ` Thomas Gleixner
2009-07-25 11:12 ` Raistlin
2 siblings, 1 reply; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 15:24 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 23:07 +0800, sen wang wrote:
> 2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> > On Fri, 2009-07-24 at 22:04 +0800, sen wang wrote:
> >
> >> just one question:
> >> if cpu is free and there is running state task, how you do?
> >> schedule the task up? or schedule idle task up?
> >
> > Well, when an RT group is over the bandwidth limit I don't consider them
> > runnable. Therefore, failing to find any other tasks, we run the idle
> > task.
> >
>
> you havn't anwser the question: if cpu is free, should we schedule the
> running state task or idle task?

It is not runnable because the group is over its limit.

> face the error and fix it! ok?

Please tone down and re-read the explanations I gave.

The throttle is an H-CBS service for RT task groups, meant to provide
isolation through a fixed resource guarantee.

Any process actually hitting the throttle means a misconfigured system
-- unless it's a temporary overload and you're able to deal with those.

The single group case is simply the trivial case thereof.

Your proposed change does not generalize to such a framework, and while
it might work with the current code, it doesn't serve a use-case
considered in this architecture and will render the interface
inconsistent.

Furthermore, future work in this area will not be able to support your
changed semantics in a sane fashion.

I've yet to see any coherent explanation of your problem, and quite
frankly I find your attitude offensive.

As you say, Linux is an open-source effort, and you're free to do with
your copy as you see fit (provided you stick to the rules stipulated by
the GPLv2). However as co-maintainer of the mainline scheduler I see no
reason to entertain your change, nor for that matter to continue this
discussion.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 15:24 ` Peter Zijlstra
@ 2009-07-24 15:43 ` sen wang
0 siblings, 0 replies; 37+ messages in thread
From: sen wang @ 2009-07-24 15:43 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 23:07 +0800, sen wang wrote:
>> 2009/7/24 Peter Zijlstra <peterz@infradead.org>:
>> > On Fri, 2009-07-24 at 22:04 +0800, sen wang wrote:
>> >
>> >> just one question:
>> >> if cpu is free and there is running state task, how you do?
>> >> schedule the task up? or schedule idle task up?
>> >
>> > Well, when an RT group is over the bandwidth limit I don't consider them
>> > runnable. Therefore, failing to find any other tasks, we run the idle
>> > task.
>> >
>>
>> you havn't anwser the question: if cpu is free, should we schedule the
>> running state task or idle task?
>
> It it not runnable because the group is over its limit.
>
>> face the error and fix it! ok?
>
> Please tone down and re-read the explanations I gave.
>
> The throttle is an H-CBS services for RT task groups, meant to provide
> isolation through a fixed resource guarantee.
>
> Any process actually hitting the throttle means a miss configured system
> -- unless its a temporary overload and you're able to deal with those.
>
> The single group case is simply the trivial case thereof.
>
> Your proposed change does not generalize to such a framework, and while
> it might work with the current code, it doesn't serve a use-case
> considered in this architecture and will render the interface
> inconsistent.
>
> Furthermore, future work in this area will not be able to support your
> changed semantics in a sane fashion.
>
> I've yet to see any coherent explanation of your problem, and quite
> frankly I find your attitude offensive.
>
> As you say, Linux is an open-source effort, and you're free to do with
> your copy as you see fit (provided you stick to the rules stipulated by
> the GPLv2). However as co-maintainer of the mainline scheduler I see no
> reason to entertain your change, nor for that matter to continue this
> discussion.
>
>

Sorry for my tone. If you feel hurt, I apologize.
But I still hold my viewpoint. I just want the 100 - X time to be
usable by a task in the running state.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 15:07 ` sen wang
2009-07-24 15:24 ` Peter Zijlstra
@ 2009-07-24 15:34 ` Thomas Gleixner
2009-07-25 11:12 ` Raistlin
2 siblings, 0 replies; 37+ messages in thread
From: Thomas Gleixner @ 2009-07-24 15:34 UTC (permalink / raw)
To: sen wang
Cc: Peter Zijlstra, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 24 Jul 2009, sen wang wrote:
> 2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> > On Fri, 2009-07-24 at 22:04 +0800, sen wang wrote:
> >
> >> just one question:
> >> if cpu is free and there is running state task, how you do?
> >> schedule the task up? or schedule idle task up?
> >
> > Well, when an RT group is over the bandwidth limit I don't consider them
> > runnable. Therefore, failing to find any other tasks, we run the idle
> > task.
> >
>
> you havn't anwser the question: if cpu is free, should we schedule the
> running state task or idle task?

Peter explained how it's implemented and why he considers it to be
correct and that it can be disabled.

> face the error and fix it! ok?

Can you please stop yelling at Peter? He politely answered your
questions. Have you even thought about his answers before shouting
"error" ?

Also be aware that you can yell "fix it" as often as you want, all you
are achieving is an entry in a couple of /dev/null procmail rules.

Please read http://www.tux.org/lkml/#s3-12 before you answer again.

Thanks,

	tglx

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 15:07 ` sen wang
2009-07-24 15:24 ` Peter Zijlstra
2009-07-24 15:34 ` Thomas Gleixner
@ 2009-07-25 11:12 ` Raistlin
2 siblings, 0 replies; 37+ messages in thread
From: Raistlin @ 2009-07-25 11:12 UTC (permalink / raw)
To: sen wang
Cc: Peter Zijlstra, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 453 bytes --]

On Fri, 2009-07-24 at 23:07 +0800, sen wang wrote:
> face the error and fix it! ok?

Wow... You're a very nice and reasonable guy, aren't you? :-O

Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)
http://blog.linux.it/raistlin / raistlin@ekiga.net /
dario.faggioli@jabber.org

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 13:54 ` Peter Zijlstra
2009-07-24 14:04 ` sen wang
@ 2009-07-24 14:24 ` sen wang
2009-07-24 14:48 ` Peter Zijlstra
1 sibling, 1 reply; 37+ messages in thread
From: sen wang @ 2009-07-24 14:24 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 21:44 +0800, sen wang wrote:
>> 2009/7/24 Peter Zijlstra <peterz@infradead.org>:
>> > On Fri, 2009-07-24 at 21:26 +0800, sen wang wrote:
>> >> don't tell me what theory. don't be so doctrinairism! OK?
>> >> If cpu is free and there is a running state task,how can you scdedule
>> >> idle task up?
>> >> I tell you again:we are not talking about a bandwidth of 100% for RT!
>> >> Bug lies in the bandwidth of (100- X)%.(X<100)
>> >> even in the time of 100-X,if there is a rt task, you should not idle()
>> >> the system.
>> >
>> > *sigh*
>> >
>> > Yes we should. I appreciate that you might assume otherwise, but you're
>> > wrong. Suppose you have two competing bandwidth groups, which one will
>> > run over, to what purpose?
>> >
>> > Also, your next top post will go to /dev/null.
>> >
>>
>>
>> OK ! maybe you has not understand what I said.
>> It not two competing bandwidth groups. there is a active group and
>> another is empty?
>> How you do?
>
> No, but the 1 group is the trivial case of many groups. Changing the
> semantics for the trivial case is inconsistent at best, and confusing at
> worst.

Yes! One group is the trivial case, but you can't say it is useless,
and in some systems it is important!
I have read through the scheduler code and tried it this way; it works:

static struct task_struct *pick_next_task_rt(struct rq *rq)
{
	...
	if (rt_rq_throttled(rt_rq) && rq->cfs.nr_running)
		return NULL;
	...
}

>> Why not try it by your hand: empty the fair task, run a rt task,enable
>> the bandwidth and
>> see what will happen!
>
> Oh, I know, I wrote the code.
>
>> In many embedded system,idle task will lead to shutdown something, but
>> the rt task will
>> assume: when it run, idle will not happen!
>
> How is it my problem when you design your system wrong?

My system is good, but there are no rules about what the idle task may
do. People always write idle-task code on the assumption that there is
no running task in the system, and people always write rt-task code on
the assumption that while the task is in the running state the system
will not idle.

What I said above is something like a theory, but I don't like the
word "theory". I call it people's common sense.

The behaviour of the throttled RT group departs from people's common
sense, so don't say people's common sense is wrong. OK?

>
> If you want your 1 RT group to not get throttled, disable the throttle,
> or adjust it to fit the parameters of your workload. If you don't want
> idle to have latency impact on your RT tasks, fix your idle behaviour.
>

One RT group is important to me. But I also have fair tasks, so the
throttle is important to me too.
And don't say that idle has a latency impact on RT tasks. It is too
ludicrous. Why would we deliberately create a latency impact
ourselves, by wrongly running the idle task?

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 14:24 ` sen wang
@ 2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 15:02 ` sen wang
2009-07-24 15:40 ` Jamie Lokier
0 siblings, 2 replies; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 14:48 UTC (permalink / raw)
To: sen wang
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 22:24 +0800, sen wang wrote:

> > No, but the 1 group is the trivial case of many groups. Changing the
> > semantics for the trivial case is inconsistent at best, and confusing at
> > worst.

> yes! 1 group is the trivial case ,but you can't say it is useless. and
> in some system
> it is important!
> I have read across the schedule codes and tried this way,it work:
> static struct task_struct *pick_next_task_rt(struct rq *rq)
> { ...
> if (rt_rq_throttled(rt_rq)&& rq->cfs.nr_running)
> return NULL;
> ....
> }

That might work in the current implementation, but like I already
explained, its not consistent with the multi-group case. Also, people
are working on making it a proper EDF scheduled CBS, it won't
generalize.

> > How is it my problem when you design your system wrong?
>
> my system is good. but there is no rules what the idle task will do,so.
> people always write codes in idle task with the assume: no any running
> task in the system.
> and people also always write codes in rt task with the assume: if I am
> in running state
> ,system will not idle.
>
> so what i said above is some like theory,but I don't like the word “theory".
> I call it people's common sense.
>
> but the behavior of the throttled RT group is changed from people's
> common sense,so don't say people's common sense is wrong. OK?

There are plenty of examples where common sense utterly fails, the one
that comes to mind is Probability Theory.

> > If you want your 1 RT group to not get throttled, disable the throttle,
> > or adjust it to fit the parameters of your workload. If you don't want
> > idle to have latency impact on your RT tasks, fix your idle behaviour.
> >
>
> 1 RT is important to me. But I also have fair task, so throttled is
> also important to me.
> and don't say : idle have latency impact on RT tasks. It is too
> ludicrous Why we make intended latency impact by ourselves,by wrong
> idle task?

Yes, configurable idle tasks are nothing new. If you care about wakeup
latency then idle=poll is preferred (it sucks for power saving, but such
is life).

On your embedded board you seem to have a particularly aggressive idle
function wrt power savings, which would result in rather large wake from
idle latencies, regardless of the bandwidth throttle, so what is the
problem?

If you're using the bandwidth throttle to control your RT tasks so as
not to starve your SCHED_OTHER tasks, then I will call your system ill
designed.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 14:48 ` Peter Zijlstra
@ 2009-07-24 15:02 ` sen wang
2009-07-24 15:40 ` Jamie Lokier
1 sibling, 0 replies; 37+ messages in thread
From: sen wang @ 2009-07-24 15:02 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

2009/7/24 Peter Zijlstra <peterz@infradead.org>:
> On Fri, 2009-07-24 at 22:24 +0800, sen wang wrote:
>
>> > No, but the 1 group is the trivial case of many groups. Changing the
>> > semantics for the trivial case is inconsistent at best, and confusing at
>> > worst.
>
>> yes! 1 group is the trivial case ,but you can't say it is useless. and
>> in some system
>> it is important!
>> I have read across the schedule codes and tried this way,it work:
>> static struct task_struct *pick_next_task_rt(struct rq *rq)
>> { ...
>> if (rt_rq_throttled(rt_rq)&& rq->cfs.nr_running)
>> return NULL;
>> ....
>> }
>
> That might work in the current implementation, but like I already
> explained, its not consistent with the multi-group case. Also, people
> are working on making it a proper EDF scheduled CBS, it won't
> generalize.
>
>> > How is it my problem when you design your system wrong?
>>
>> my system is good. but there is no rules what the idle task will do,so.
>> people always write codes in idle task with the assume: no any running
>> task in the system.
>> and people also always write codes in rt task with the assume: if I am
>> in running state
>> ,system will not idle.
>>
>> so what i said above is some like theory,but I don't like the word "theory".
>> I call it people's common sense.
>>
>> but the behavior of the throttled RT group is changed from people's
>> common sense,so don't say people's common sense is wrong. OK?
>
> There are plenty of examples where common sense utterly fails, the one
> that comes to mind is Probability Theory.
>
>> > If you want your 1 RT group to not get throttled, disable the throttle,
>> > or adjust it to fit the parameters of your workload. If you don't want
>> > idle to have latency impact on your RT tasks, fix your idle behaviour.
>> >
>>
>> 1 RT is important to me. But I also have fair task, so throttled is
>> also important to me.
>> and don't say : idle have latency impact on RT tasks. It is too
>> ludicrous Why we make intended latency impact by ourselves,by wrong
>> idle task?
>
> Yes, configurable idle tasks are nothing new. If you care about wakeup
> latency then idle=poll is preferred (it sucks for power saving, but such
> is life).
>
> On your embedded board you seem to have a particularly aggressive idle
> function wrt power savings, which would result in rather large wake from
> idle latencies, regardless of the bandwidth throttle, so what is the
> problem?
>

Don't guess what I do in my idle. My idle is perfect! And don't think
that only you understand scheduling and that whatever you consider
right is right. Linux is a free world.

> If you're using the bandwidth throttle to control your RT tasks so as
> not to starve your SCHED_OTHER tasks, then I will call your system ill
> designed.
>

Using the bandwidth throttle to control RT tasks is useful. Of course I
know how to keep SCHED_OTHER tasks from being starved. We are only
discussing how to deal with the 100-X time, that is, the part of each
period left over once the RT runtime of X has been used up. And, very
unfortunately, you are wrong.
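The disagreement above is about what the scheduler picks when the RT runqueue is throttled and the fair runqueue is empty. A small user-space model may make the two behaviours easier to compare. This is only a sketch under stated assumptions: `struct rq_model`, `pick_next()` and the `relaxed` flag are hypothetical names made up for illustration, not kernel code, and the model covers a single RT group only, which is the case Peter says does not generalize to multiple groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of one CPU's runqueue. */
struct rq_model {
    unsigned int rt_nr_running;   /* runnable SCHED_FIFO/SCHED_RR tasks */
    unsigned int cfs_nr_running;  /* runnable SCHED_OTHER tasks */
    bool rt_throttled;            /* RT runtime for this period is used up */
};

enum picked_class { PICK_RT, PICK_FAIR, PICK_IDLE };

/*
 * relaxed == false: behaviour described in the thread as current;
 *                   a throttled RT queue is always skipped.
 * relaxed == true:  sen wang's proposal; a throttled RT queue is skipped
 *                   only when the fair queue has something to run.
 */
enum picked_class pick_next(const struct rq_model *rq, bool relaxed)
{
    assert(rq != NULL);

    if (rq->rt_nr_running > 0) {
        bool skip_rt = relaxed
            ? (rq->rt_throttled && rq->cfs_nr_running > 0)
            : rq->rt_throttled;
        if (!skip_rt)
            return PICK_RT;
    }
    if (rq->cfs_nr_running > 0)
        return PICK_FAIR;
    return PICK_IDLE;
}
```

With one throttled RT task and no fair tasks, `pick_next(&rq, false)` yields `PICK_IDLE` while `pick_next(&rq, true)` yields `PICK_RT`; as soon as a fair task is runnable, both variants pick `PICK_FAIR`.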
* Re: report a bug about sched_rt
2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 15:02 ` sen wang
@ 2009-07-24 15:40 ` Jamie Lokier
2009-07-24 16:01 ` Peter Zijlstra
1 sibling, 1 reply; 37+ messages in thread
From: Jamie Lokier @ 2009-07-24 15:40 UTC (permalink / raw)
To: Peter Zijlstra
Cc: sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Peter Zijlstra wrote:
> If you're using the bandwidth throttle to control your RT tasks so as
> not to starve your SCHED_OTHER tasks, then I will call your system ill
> designed.

What mechanism should be used to avoid starving SCHED_OTHER tasks, in
the event there are unforeseen bugs or unpredictable calculation times
in an RT task?

Thanks,
-- Jamie
* Re: report a bug about sched_rt
2009-07-24 15:40 ` Jamie Lokier
@ 2009-07-24 16:01 ` Peter Zijlstra
2009-07-24 23:30 ` Jamie Lokier
2009-07-25 12:19 ` Raistlin
0 siblings, 2 replies; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-24 16:01 UTC (permalink / raw)
To: Jamie Lokier
Cc: sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

On Fri, 2009-07-24 at 16:40 +0100, Jamie Lokier wrote:
> Peter Zijlstra wrote:
> > If you're using the bandwidth throttle to control your RT tasks so as
> > not to starve your SCHED_OTHER tasks, then I will call your system ill
> > designed.
>
> What mechanism should be used to avoid starving SCHED_OTHER tasks, in
> the event there are unforeseen bugs or unpredictable calculation times
> in an RT task?

For bugs the throttle works. Like I said, a well-functioning system is
not supposed to hit the throttle; obviously a bug precludes the
well-functioning qualification :-)

Unpredictable calculation times can be dealt with on the application
design level, for example using techniques such as outlined here:

http://feanor.sssup.it/~faggioli/papers/OSPERT-2009-dlexception.pdf

These really are things you should know about before writing an RT
application ;-)
* Re: report a bug about sched_rt
2009-07-24 16:01 ` Peter Zijlstra
@ 2009-07-24 23:30 ` Jamie Lokier
2009-07-25 5:22 ` Bill Gatliff
2009-07-25 12:33 ` Raistlin
2009-07-25 12:19 ` Raistlin
1 sibling, 2 replies; 37+ messages in thread
From: Jamie Lokier @ 2009-07-24 23:30 UTC (permalink / raw)
To: Peter Zijlstra
Cc: sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Peter Zijlstra wrote:
> On Fri, 2009-07-24 at 16:40 +0100, Jamie Lokier wrote:
> > Peter Zijlstra wrote:
> > > If you're using the bandwidth throttle to control your RT tasks so as
> > > not to starve your SCHED_OTHER tasks, then I will call your system ill
> > > designed.
> >
> > What mechanism should be used to avoid starving SCHED_OTHER tasks, in
> > the event there are unforeseen bugs or unpredictable calculation times
> > in an RT task?
>
> For bugs the throttle works, like I said a well functioning system is
> not supposed to hit the throttle, obviously a bug precludes the well
> functioning qualification :-)
>
> Unpredictable calculation times can be dealt with on the application
> design level, for example using techniques such as outlined here:
>
> http://feanor.sssup.it/~faggioli/papers/OSPERT-2009-dlexception.pdf
>
> These really are things you should know about before writing an RT
> application ;-)

Certainly those things can be used, if you are really serious about RT
behaviour. They are quite complex.

For simple things like "try to keep the buffer to my DVD writer full"
(no I don't know how much CPU that requires - it's a kind of "best
effort but try very hard!"), it would be quite useful to have
something like RT-bandwidth which grants a certain percentage of time
as an RT task, and effectively downgrades it to SCHED_OTHER when that
time is exceeded to permit some fairness with the rest of the system.

You can do that in userspace using the techniques in the PDF, and I
have looked at such techniques many years ago (2.2 days!), but the
same could be said about RT-bandwidth. But it's much easier to just
set a kernel parameter.

-- Jamie
* Re: report a bug about sched_rt
2009-07-24 23:30 ` Jamie Lokier
@ 2009-07-25 5:22 ` Bill Gatliff
2009-07-25 22:48 ` Jamie Lokier
2009-07-25 12:33 ` Raistlin
1 sibling, 1 reply; 37+ messages in thread
From: Bill Gatliff @ 2009-07-25 5:22 UTC (permalink / raw)
To: Jamie Lokier
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Jamie Lokier wrote:
> For simple things like "try to keep the buffer to my DVD writer full"
> (no I don't know how much CPU that requires - it's a kind of "best
> effort but try very hard!"), it would be quite useful to have
> something like RT-bandwidth which grants a certain percentage of time
> as an RT task, and effectively downgrades it to SCHED_OTHER when that
> time is exceeded to permit some fairness with the rest of the system.
>

Useful perhaps, but an application design that explicitly communicates
your desires to the scheduler will be more robust, even if it does seem
more complex at the outset.

I'm with Peter on this one. My impression of RT-bandwidth is that you
shouldn't ever see it doing anything unless your system contains an
error. In those situations, it's definitely a handy alternative to
rebooting to get your shell back.

But I don't think you want to build a system that depends on it, perhaps
for no other reason than the fact that if RT-bandwidth doesn't make your
system behave itself then you don't have a Plan B anymore.

b.g.
--
Bill Gatliff
bgat@billgatliff.com
* Re: report a bug about sched_rt
2009-07-25 5:22 ` Bill Gatliff
@ 2009-07-25 22:48 ` Jamie Lokier
2009-07-26 2:44 ` Bill Gatliff
0 siblings, 1 reply; 37+ messages in thread
From: Jamie Lokier @ 2009-07-25 22:48 UTC (permalink / raw)
To: Bill Gatliff
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Bill Gatliff wrote:
> Jamie Lokier wrote:
> >For simple things like "try to keep the buffer to my DVD writer full"
> >(no I don't know how much CPU that requires - it's a kind of "best
> >effort but try very hard!"), it would be quite useful to have
> >something like RT-bandwidth which grants a certain percentage of time
> >as an RT task, and effectively downgrades it to SCHED_OTHER when that
> >time is exceeded to permit some fairness with the rest of the system.
> >
>
> Useful perhaps, but an application design that explicitly communicates
> your desires to the scheduler will be more robust, even if it does seem
> more complex at the outset.

I agree with communicating the desire explicitly to the scheduler.

In the above example, the exact desire is "give me as much CPU as I
ask for, because my hardware servicing will be adversely but
non-fatally affected if you don't, and the amount of CPU needed to
service the hardware cannot be determined in advance, but prevent me
from blocking progress in the rest of the system by limiting my
exclusive ownership of the CPU".

How do you propose to communicate that to the scheduler, if not by
something rather like RT-bandwidth with downgrading to SCHED_OTHER
when a policy limit is exceeded?

-- Jamie
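The policy Jamie is asking for (run as RT until a per-period budget is spent, then compete as SCHED_OTHER until the next period) can be written down as a few lines of accounting. The following is a minimal sketch, assuming a single task and fixed periods; `struct soft_budget`, `soft_budget_charge()` and the `RUN_AS_*` values are hypothetical names invented here, and nothing below exists in the kernel or claims to match how RT-bandwidth is implemented.

```c
#include <assert.h>

/* Which class the task should compete in for its next slice. */
enum soft_class { RUN_AS_RT, RUN_AS_OTHER };

/* Hypothetical per-task budget: runtime_us of RT time in every period_us. */
struct soft_budget {
    long runtime_us;       /* RT time granted per period */
    long period_us;        /* replenishment period */
    long period_start_us;  /* start of the current period */
    long used_us;          /* RT time consumed in the current period */
};

/*
 * Account ran_us of execution that finished at time now_us and report
 * the class for the next slice. The budget is refilled whenever a period
 * boundary has been crossed; once it is spent the task is downgraded
 * rather than stopped, so it still runs if the CPU is otherwise free.
 */
enum soft_class soft_budget_charge(struct soft_budget *b, long now_us, long ran_us)
{
    assert(b != 0 && b->period_us > 0 && ran_us >= 0);

    if (now_us - b->period_start_us >= b->period_us) {
        long periods = (now_us - b->period_start_us) / b->period_us;
        b->period_start_us += periods * b->period_us;
        b->used_us = 0;
    }
    b->used_us += ran_us;
    return b->used_us < b->runtime_us ? RUN_AS_RT : RUN_AS_OTHER;
}
```

The difference from the throttle discussed in this thread is the return value: a hard limit would map the exhausted case to "do not run at all", whereas this soft variant maps it to "run without RT priority".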
* Re: report a bug about sched_rt
2009-07-25 22:48 ` Jamie Lokier
@ 2009-07-26 2:44 ` Bill Gatliff
2009-07-26 19:03 ` Jamie Lokier
0 siblings, 1 reply; 37+ messages in thread
From: Bill Gatliff @ 2009-07-26 2:44 UTC (permalink / raw)
To: Jamie Lokier
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Jamie Lokier wrote:
> Bill Gatliff wrote:
>
>> Jamie Lokier wrote:
>>
>>> For simple things like "try to keep the buffer to my DVD writer full"
>>> (no I don't know how much CPU that requires - it's a kind of "best
>>> effort but try very hard!"), it would be quite useful to have
>>> something like RT-bandwidth which grants a certain percentage of time
>>> as an RT task, and effectively downgrades it to SCHED_OTHER when that
>>> time is exceeded to permit some fairness with the rest of the system.
>>>
>>>
>> Useful perhaps, but an application design that explicitly communicates
>> your desires to the scheduler will be more robust, even if it does seem
>> more complex at the outset.
>>
>
> I agree with communicating the desire explicitly to the scheduler.
>
> In the above example, the exact desire is "give me as much CPU as I
> ask for, because my hardware servicing will be adversely but
> non-fatally affected if you don't, and the amount of CPU needed to
> service the hardware cannot be determined in advance, but prevent me
> from blocking progress in the rest of the system by limiting my
> exclusive ownership of the CPU".
>
> How do you propose to communicate that to the scheduler, if not by
> something rather like RT-bandwidth with downgrading to SCHED_OTHER
> when a policy limit is exceeded?
>

This is a great real-world problem. And there's no one-size-fits-all
answer, unfortunately.

RT-bandwidth will give you the system behavior you are after, but it's a
pretty blunt instrument.

I'd consider putting some throttling in your interrupt handler that
prevents it from running more than a certain amount of calculation per
interrupt event. And perhaps it's looking at execution timestamps to
determine how often it's running, and can therefore do a rough
calculation of how much CPU it's eating. At least until threaded
interrupt scheduling is widespread, a runaway interrupt handler is
definitely an opportunity to hang up a system.

Tasklets are nice for this, because the scheduler won't re-queue one if
it's already running. So if your interrupt handler's job is just to
launch the tasklet, and you know how much time the tasklet takes to run,
then if you get a burst of interrupts you don't end up launching an
equivalent burst of scheduled work: eventually the interrupt handler
overtakes the tasklet, and the additional interrupt events get dropped.

That's often a decent way to deal with system overload, especially if it
leaves the system functional enough to take some sort of "evasive
action" like reverting to polled i/o, issuing a diagnostic message, or
doing an orderly transition to a safe mode. A flood ping, lots of
paging, and driver bugs are just a few ways you can encounter an
unexpected burst of interrupt activity that might, if not dealt with on
some level, cause the system to suddenly destabilize.

Point is, keep a mentality that you want to fall back onto RT-bandwidth
(or any other type of watchdog timer expiration) only after you've
exhausted all other options. Pretend it isn't there, but definitely
know what will happen if it ever steps in. A system coded that way is
much more resistant to breakage, in my experience anyway.

b.g.
--
Bill Gatliff
bgat@billgatliff.com
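The coalescing effect Bill describes for tasklets (a burst of interrupts does not turn into an equal burst of deferred work) comes down to a "pending" flag that is set by the handler and cleared when the work runs. The model below is a user-space sketch of that idea only; `struct work_model`, `irq_request_work()` and `run_pending_work()` are hypothetical names, and the real tasklet machinery uses atomic bit operations and per-CPU lists that are deliberately left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one piece of deferred work. */
struct work_model {
    bool pending;           /* work has been requested but has not run yet */
    unsigned long runs;     /* how many times the work actually ran */
    unsigned long dropped;  /* requests absorbed because work was pending */
};

/* Stand-in for the interrupt handler: only asks for the work to run. */
bool irq_request_work(struct work_model *w)
{
    assert(w != NULL);
    if (w->pending) {
        w->dropped++;       /* burst absorbed: no second queue entry */
        return false;
    }
    w->pending = true;
    return true;
}

/* Stand-in for the deferred work itself: runs at most once per request. */
bool run_pending_work(struct work_model *w)
{
    assert(w != NULL);
    if (!w->pending)
        return false;
    w->pending = false;
    w->runs++;
    return true;
}
```

Five back-to-back calls to `irq_request_work()` followed by one `run_pending_work()` leave `runs == 1` and `dropped == 4`, so the amount of scheduled work is bounded by how fast the work drains, not by the interrupt rate.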
* Re: report a bug about sched_rt
2009-07-26 2:44 ` Bill Gatliff
@ 2009-07-26 19:03 ` Jamie Lokier
2009-07-27 10:45 ` Peter Zijlstra
2009-07-27 13:35 ` Bill Gatliff
0 siblings, 2 replies; 37+ messages in thread
From: Jamie Lokier @ 2009-07-26 19:03 UTC (permalink / raw)
To: Bill Gatliff
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Bill Gatliff wrote:
> Jamie Lokier wrote:
> >I agree with communicating the desire explicitly to the scheduler.
> >
> >In the above example, the exact desire is "give me as much CPU as I
> >ask for, because my hardware servicing will be adversely but
> >non-fatally affected if you don't, and the amount of CPU needed to
> >service the hardware cannot be determined in advance, but prevent me
> >from blocking progress in the rest of the system by limiting my
> >exclusive ownership of the CPU".
> >
> >How do you propose to communicate that to the scheduler, if not by
> >something rather like RT-bandwidth with downgrading to SCHED_OTHER
> >when a policy limit is exceeded?
>
> This is a great real-world problem. And there's no one-size-fits-all
> answer, unfortunately.
>
> RT-bandwidth will give you the system behavior you are after, but it's a
> pretty blunt instrument.

I'm under the impression that RT-bandwidth will *not* give the above
system behaviour, and that is the whole reason for this thread.

> I'd consider putting some throttling in your interrupt handler that
> prevents it from running more than a certain amount of calculation per
> interrupt event.

There is no interrupt handler in my specification above...

> And perhaps it's looking at execution timestamps to
> determine how often it's running, and can therefore do a rough
> calculation of how much CPU it's eating. At least until threaded
> interrupt scheduling is widespread, a runaway interrupt handler is
> definitely an opportunity to hang up a system.

With threaded interrupt scheduling using RT priority, that opportunity
to hang the system is exactly the same.

Indeed, threaded interrupts are a good example of when you might want
a limited fraction of the CPU allocated to that thread at RT priority,
falling down to SCHED_OTHER if the handler needs to continue to run.

That is, in fact, how

> Tasklets

tasklets, bottom halves and things like that work :-)

[snip explanation of tasklets]

> That's often a decent way to deal with system overload, especially if it
> leaves the system functional enough to take some sort of "evasive
> action" like reverting to polled i/o, issuing a diagnostic message, or
> doing an orderly transition to a safe mode.

Polled I/O is good when this happens.

You can revert to polled I/O automatically without coding it
explicitly in interrupt handlers, if the scheduler provides
appropriate support.

When a threaded interrupt (with RT priority, naturally) is run too
often, then you stop scheduling it as RT and bring it down to
SCHED_OTHER or lower, periodically allowing it to have a fair share of
the CPU when there are other runnable tasks. That's quite close to
polling I/O, without coding it explicitly in the device driver.

So RT-bandwidth would be nice for those threaded interrupts.

-- Jamie
* Re: report a bug about sched_rt
2009-07-26 19:03 ` Jamie Lokier
@ 2009-07-27 10:45 ` Peter Zijlstra
2009-07-27 13:35 ` Bill Gatliff
1 sibling, 0 replies; 37+ messages in thread
From: Peter Zijlstra @ 2009-07-27 10:45 UTC (permalink / raw)
To: Jamie Lokier
Cc: Bill Gatliff, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel, Tommaso Cucinotta

On Sun, 2009-07-26 at 20:03 +0100, Jamie Lokier wrote:
> So RT-bandwidth would be nice for those threaded interrupts.

No, a different/better scheduling policy would be - maybe.

People mentioned SCHED_SPORADIC, but I really really dislike that
because for the actual sporadic task model we can do so much better
using deadline schedulers.

Furthermore, SCHED_SPORADIC as specified by POSIX is a useless piece of
crap, so we would have to deviate from POSIX, which would create
confusion -- although good documentation might help a little here.

The current RT-bandwidth comes from the RT cgroup code, and its only
purpose in life is to provide isolation between multiple groups through
guaranteeing the bandwidth of others by hard limiting.

It does that.

It's certainly not flawless; in fact it's not what I would call complete
(hence its still-EXPERIMENTAL status), but Fabio is working on
implementing a deadline H-CBS for this, which would greatly improve the
situation.

Extending the deadline model with a soft mode might be useful as
mentioned by Tommaso, but I would only be looking at that after we've
completed work on the normal deadline bits (both group and task). And
then we'd have to consistently and fully integrate it with both.
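For the single-group case argued about earlier in the thread, the hard limit Peter describes is exposed through two sysctl files, /proc/sys/kernel/sched_rt_period_us and /proc/sys/kernel/sched_rt_runtime_us; writing -1 to the runtime file disables the throttle. The sketch below, with hypothetical helper names `read_long()` and `rt_share_percent()`, shows one way the RT share of each period (and hence the remaining "100-X" that sen wang refers to) could be computed from those two values. It is an illustration only and does not cover per-cgroup settings.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one integer from a file such as
 * /proc/sys/kernel/sched_rt_runtime_us.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_long(const char *path, long *out)
{
    FILE *f;
    int parsed;

    assert(path != NULL && out != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    parsed = fscanf(f, "%ld", out);
    fclose(f);
    return parsed == 1 ? 0 : -1;
}

/*
 * Percentage of each period that RT tasks may consume.
 * A negative runtime means the throttle is disabled, i.e. 100%.
 */
double rt_share_percent(long runtime_us, long period_us)
{
    assert(period_us > 0);
    if (runtime_us < 0)
        return 100.0;
    if (runtime_us > period_us)
        runtime_us = period_us;
    return 100.0 * (double)runtime_us / (double)period_us;
}
```

With the usual defaults of 950000 and 1000000, `rt_share_percent()` returns 95, which leaves 5% of every second that throttled RT tasks cannot use even when nothing else is runnable. That leftover is the time this thread is about.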
* Re: report a bug about sched_rt
2009-07-26 19:03 ` Jamie Lokier
2009-07-27 10:45 ` Peter Zijlstra
@ 2009-07-27 13:35 ` Bill Gatliff
1 sibling, 0 replies; 37+ messages in thread
From: Bill Gatliff @ 2009-07-27 13:35 UTC (permalink / raw)
To: Jamie Lokier
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Jamie Lokier wrote:
> Bill Gatliff wrote:
>
>> Jamie Lokier wrote:
>>
>>> I agree with communicating the desire explicitly to the scheduler.
>>>
>>> In the above example, the exact desire is "give me as much CPU as I
>>> ask for, because my hardware servicing will be adversely but
>>> non-fatally affected if you don't, and the amount of CPU needed to
>>> service the hardware cannot be determined in advance, but prevent me
>>> from blocking progress in the rest of the system by limiting my
>>> exclusive ownership of the CPU".
>>>
>>> How do you propose to communicate that to the scheduler, if not by
>>> something rather like RT-bandwidth with downgrading to SCHED_OTHER
>>> when a policy limit is exceeded?
>>>
>> This is a great real-world problem. And there's no one-size-fits-all
>> answer, unfortunately.
>>
>> RT-bandwidth will give you the system behavior you are after, but it's a
>> pretty blunt instrument.
>>
>
> I'm under the impression that RT-bandwidth will *not* give the above
> system behaviour, and that is the whole reason for this thread.
>

I think I misspoke. What I meant to say is that RT-bandwidth will
(probably) prevent the hardware handler from eating 100% of the CPU.
But the system will suffer quite a, um, "discontinuity" when the
throttling happens.

>
>> I'd consider putting some throttling in your interrupt handler that
>> prevents it from running more than a certain amount of calculation per
>> interrupt event.
>>
>
> There is no interrupt handler in my specification above...
>

True. But in practice, I think such devices are typically
interrupt-driven at some level.

b.g.
--
Bill Gatliff
bgat@billgatliff.com
* Re: report a bug about sched_rt
2009-07-24 23:30 ` Jamie Lokier
2009-07-25 5:22 ` Bill Gatliff
@ 2009-07-25 12:33 ` Raistlin
2009-07-25 14:58 ` Tommaso Cucinotta
1 sibling, 1 reply; 37+ messages in thread
From: Raistlin @ 2009-07-25 12:33 UTC (permalink / raw)
To: Jamie Lokier
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel, Tommaso Cucinotta

On Sat, 2009-07-25 at 00:30 +0100, Jamie Lokier wrote:
> > Unpredictable calculation times can be dealt with on the application
> > design level, for example using techniques such as outlined here:
> >
> > http://feanor.sssup.it/~faggioli/papers/OSPERT-2009-dlexception.pdf
> >
> > These really are things you should know about before writing an RT
> > application ;-)
>
> Certainly those things can be used, if you are really serious about RT
> behaviour.

Well... True... But not that much! :-)

> They are quite complex.
>
Agreed... But it's just at its start; we are still working on it and
trying to improve it...

> For simple things like "try to keep the buffer to my DVD writer full"
> (no I don't know how much CPU that requires - it's a kind of "best
> effort but try very hard!"), it would be quite useful to have
> something like RT-bandwidth which grants a certain percentage of time
> as an RT task, and effectively downgrades it to SCHED_OTHER when that
> time is exceeded to permit some fairness with the rest of the system.
>
Well, agreed, again. If you want something very useful, you need the
combination of the two: user space techniques and kernel space support.

The mechanism described in the paper works at its best if run on top
of the proper scheduling policies/framework... And the rt-throttling
mechanism which is currently in place --or some improvements of it--
could definitely be one of those.

> You can do that in userspace using the techniques in the PDF, and I
> have looked at such techniques many years ago (2.2 days!), but the
> same could be said about RT-bandwidth. But it's much easier to just
> set a kernel parameter.
>
The aim of the mechanism was not to move RT to userspace, forgetting
about kernel support... Believe me: we are way far from that! :-)

As said, the point is to let the user specify some --typically--
real-time characteristics of their apps, and have them enforced
somehow.

I don't think comparing kernel-space throttling with our user-space
deadline/WCET violation notification is the right thing to do, since
they actually have very different objectives!
Throttling is aimed at limiting the bandwidth of real-time apps (or
groups of them) without the need for them to be aware of that. Our
exception-based mechanism is aimed at giving the application developer
the capability of being aware of exactly that!

So, different tools for different goals, I think, which however could
work together, if needed...

I hope it does not seem that I'm trying to push our mechanism over
anything... Just trying to clarify a little bit why we conceived it and
how it works. :-)

Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)
http://blog.linux.it/raistlin / raistlin@ekiga.net /
dario.faggioli@jabber.org
* Re: report a bug about sched_rt
2009-07-25 12:33 ` Raistlin
@ 2009-07-25 14:58 ` Tommaso Cucinotta
0 siblings, 0 replies; 37+ messages in thread
From: Tommaso Cucinotta @ 2009-07-25 14:58 UTC (permalink / raw)
To: Raistlin
Cc: Jamie Lokier, Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel

Hi all,

Raistlin wrote:
>> For simple things like "try to keep the buffer to my DVD writer full"
>> (no I don't know how much CPU that requires - it's a kind of "best
>> effort but try very hard!"), it would be quite useful to have
>> something like RT-bandwidth which grants a certain percentage of time
>> as an RT task, and effectively downgrades it to SCHED_OTHER when that
>> time is exceeded to permit some fairness with the rest of the system
> Well, agree, again. If you want something very useful, you need the
> combination of the two: user space techniques and kernel space support.
>

I didn't follow the entire discussion, however I'd like to add a
comment, if it may be of any help. What is useful actually depends on
the usage scenario and its requirements, comprising for example
real-time and security requirements.

On one hand, giving a real-time task the opportunity to keep running
even if its budget is exhausted may of course be useful for the
real-time task. In fact, in the real-time literature, you can find the
term "soft reservations" to denote those real-time scheduling
mechanisms that have such a property (and still preserve theoretical
schedulability), with various ways of distributing the spare capacity
among the real-time tasks. On a GPOS like Linux, it may also be useful
to "downgrade" an RT task to SCHED_OTHER when its budget is exhausted.
In fact, in the AQuoSA EDF-based scheduler [academic], if the flag
"SOFT_SERVER" is specified when creating a server, this is exactly what
happens :-). On a related note, in the POSIX SPORADIC_SERVER (and e.g.,
its implementation by Dario Faggioli) there is a "low priority" field
specifying the priority at which the task should run when the budget is
exhausted.

However, if you depart from the traditional "embedded" context (i.e.,
industrial control) and switch for example to a "multi-user server"
context, then a task "triggering" the throttling does not necessarily
constitute a system bug that "needs a reboot"; it may simply be due to
an application trying to use the system more than it is supposed to.
Imagine a "pay-per-compute" context in which the share of a server is
granted to a user (i.e., to a VM). Then, a provider would not
necessarily want to grant a user more computation capability than the
user has paid for. In fact, in the AQuoSA scheduler [again, academic],
an access-control model exists by which the sys-admin may decide which
users (and user groups) are authorized to access the "SOFT_SERVER"
facility (i.e., real-time reservations for "gold" users might be
allowed to be soft, but the ones for "bronze" users might not).

Therefore, IMHO there is no "silver bullet"; what behavior is best
depends on the security requirements that may be in place. Access to
the "soft server" mentioned above is just an example, but plenty of
other issues may arise, including: the maximum system capacity that
users may be authorized to occupy, the maximum RT server period that
users may be authorized to use (so as not to starve the background OS
for too long), the minimum RT server period (so as not to cause too
much scheduling overhead), etc.

A more detailed discussion about security requirements arising when
granting real-time facilities to unprivileged users on a GPOS may be
found in [1], in case anyone is interested.

Regards,
T.

[1] Tommaso Cucinotta, "Access Control for Adaptive Reservations on
Multi-User Systems", in Proceedings of the 14th IEEE Real-Time and
Embedded Technology and Applications Symposium (RTAS 2008), St. Louis,
MO, United States, April 2008, available at:
http://feanor.sssup.it/~tommaso/publications/RTAS-2008.pdf

--
Tommaso Cucinotta, Computer Engineering PhD, Researcher
ReTiS Lab, Scuola Superiore Sant'Anna, Pisa, Italy
Tel +39 050 882 024, Fax +39 050 882 003
http://feanor.sssup.it/~tommaso
* Re: report a bug about sched_rt
2009-07-24 16:01 ` Peter Zijlstra
2009-07-24 23:30 ` Jamie Lokier
@ 2009-07-25 12:19 ` Raistlin
2009-07-25 22:54 ` Jamie Lokier
1 sibling, 1 reply; 37+ messages in thread
From: Raistlin @ 2009-07-25 12:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jamie Lokier, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel, Tommaso Cucinotta

On Fri, 2009-07-24 at 18:01 +0200, Peter Zijlstra wrote:
> For bugs the throttle works, like I said a well functioning system is
> not supposed to hit the throttle, obviously a bug precludes the well
> functioning qualification :-)
>
Yes, I also think a bandwidth isolation/throttling mechanism could help
a lot, both with bugs and when you need hard real-time, soft real-time
and non real-time applications to live together in one single system
such as Linux is --or is about to become.

> Unpredictable calculation times can be dealt with on the application
> design level, for example using techniques such as outlined here:
>
> http://feanor.sssup.it/~faggioli/papers/OSPERT-2009-dlexception.pdf
>
Thanks Peter! :-)

We're getting more citations on this ML than in 'our' academic
world... I'm not sure it is useful for our PhDs and research careers,
but, indeed, I like that very much anyway! :-P

The mechanism proposed in that paper is one way of giving developers
the capability of specifying some typical real-time "attributes" of an
application (or part of it), such as deadline and/or expected (worst
case?) execution time.

It is probably not always the best way of doing it, but it's something
we think could be useful somewhere. Therefore, we are still working on
it, e.g., improving timer resolution, adding support for new semantics
and programming models, etc. Moreover, we are open to any suggestion
and contribution about this work, especially from the community!

> These really are things you should know about before writing an RT
> application ;-)
:-D

Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)
http://blog.linux.it/raistlin / raistlin@ekiga.net /
dario.faggioli@jabber.org
* Re: report a bug about sched_rt
2009-07-25 12:19 ` Raistlin
@ 2009-07-25 22:54 ` Jamie Lokier
2009-07-25 23:24 ` Tommaso Cucinotta
0 siblings, 1 reply; 37+ messages in thread
From: Jamie Lokier @ 2009-07-25 22:54 UTC (permalink / raw)
To: Raistlin
Cc: Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel, linux-kernel, Tommaso Cucinotta

Raistlin wrote:
> > http://feanor.sssup.it/~faggioli/papers/OSPERT-2009-dlexception.pdf
>
> It is probably not always the best way of doing, but it's something we
> think it could be useful somewhere. Therefore, we are still working on
> it, e.g., improving timer resolution, adding the support for new
> semantic and programming models, etc. Moreover, we are open to any
> suggestion and contribution about this work, especially from the
> community!

The biggest weakness I see is if the application has a bug such as
overwriting random memory or terminating the thread which is receiving
timer signals, it can easily break the scheduling policy by accident.

When the scheduling policy is implemented in the kernel, it can only
be broken by system calls requesting a change of scheduling policy,
which are relatively unlikely, and if necessary that can be completely
prevented by security controls.

-- Jamie
* Re: report a bug about sched_rt
2009-07-25 22:54 ` Jamie Lokier
@ 2009-07-25 23:24 ` Tommaso Cucinotta
0 siblings, 0 replies; 37+ messages in thread
From: Tommaso Cucinotta @ 2009-07-25 23:24 UTC (permalink / raw)
To: Jamie Lokier
Cc: Raistlin, Peter Zijlstra, sen wang, mingo, akpm, kernel, npiggin,
arjan, linux-arm-kernel, linux-kernel, Tommaso Cucinotta

Hi,

Jamie Lokier wrote:
> Raistlin wrote:
>
>>> http://feanor.sssup.it/~faggioli/papers/OSPERT-2009-dlexception.pdf
>>>
>> Moreover, we are open to any
>> suggestion and contribution about this work, especially from the
>> community!
>>
>
> The biggest weakness I see is if the application has a bug such as
> overwriting random memory or terminating the thread which is receiving
> timer signals, it can easily break the scheduling policy by accident.
>
> When the scheduling policy is implemented in the kernel, it can only
> be broken by system calls requesting a change of scheduling policy,
> which are relatively unlikely, and if necessary that can be completely
> prevented by security controls.
>

First, thanks for your comments and interest. The mentioned mechanism
should be regarded as something that helps in the development of
programs that need to meet timing requirements. It is not meant at all
to constitute a "user-level" scheduler, nor to replace a real-time
scheduler's job. On the contrary, its purpose is solely to push towards
a software design paradigm in which the "awareness" of the existing
timing constraints, and the "awareness" of the possibility that they
might be violated (on a GPOS), is coded at the program level (by means
of an exception-like paradigm).

The mechanism has been designed as a complement to the real-time
scheduling facilities that the kernel provides. As an example, imagine
that, by a proper configuration of the scheduling policy and parameters,
an application may be guaranteed a certain budget every period. However,
the requested budget cannot be the actual worst case, because it would
not be practical to compute it (nor feasible, because it would depend on
a lot of external factors such as interrupt load, etc.) and it would
lead to too much under-usage of resources. By using this exception-like
paradigm, the developer may code, into an "exception-handler" segment,
the recovery actions needed whenever the real-time task is about to
violate, for example, its WCET constraints (for example, one could use a
"try_wcet()" block with a WCET spec which is slightly lower than the
budget configured into the scheduler, the difference being the WCET of
the exception-handling code). Then, the real-time scheduler will still
enforce the configured budget for this application, so, if the recovery
logic embedded inside the exception handler takes too long (i.e., its
WCET has been under-estimated), temporal isolation is guaranteed in any
case, and the application won't negatively impact other applications'
real-time guarantees provided by the scheduler.

In fact, we're working on a practical case study where both the
timing-exception mechanism and one of the real-time schedulers we have
here at SSSA are used. For example, a potential issue we're thinking
about is if and how to "synchronize" in some way the view of time of the
exception-based mechanism with that of the scheduler, because in prior
experiences, built by modifying multimedia applications to take
advantage of feedback-based real-time scheduling, this was one of the
burdens we had to face before the application started behaving as
"theoretically" foreseen.

Hope this clarifies our view. Please feel free to post further comments.

Regards,

T.

^ permalink raw reply [flat|nested] 37+ messages in thread
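[Editorial sketch] The exception-like paradigm described in the message above can be illustrated in plain C. This is only a rough userspace approximation under stated assumptions, not the mechanism from the linked paper: the names run_with_wcet(), arm_cpu_timer(), wcet_expired() and the demo jobs are hypothetical, and the "WCET budget" is approximated with a one-shot process CPU-time timer (ITIMER_PROF) whose signal handler unwinds to a recovery path with siglongjmp().

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static sigjmp_buf wcet_env;   /* jump target for the "timing exception" */
static int overruns;          /* how many times the recovery code ran */

/* SIGPROF handler: abandon the job and unwind into run_with_wcet(). */
static void wcet_expired(int signo)
{
    (void)signo;
    siglongjmp(wcet_env, 1);
}

/* Arm (usec > 0) or disarm (usec == 0) a one-shot process CPU-time timer. */
static int arm_cpu_timer(long usec)
{
    struct itimerval it;

    assert(usec >= 0);
    memset(&it, 0, sizeof it);
    it.it_value.tv_sec = usec / 1000000;
    it.it_value.tv_usec = usec % 1000000;
    return setitimer(ITIMER_PROF, &it, NULL);
}

/*
 * Hypothetical stand-in for a try_wcet() block: run job() with a CPU-time
 * budget; if the budget expires first, run recover() instead.
 * Returns 0 if job() finished in time, 1 if recovery ran, -1 on setup error.
 */
int run_with_wcet(long budget_usec, void (*job)(void), void (*recover)(void))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wcet_expired;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    if (sigsetjmp(wcet_env, 1) == 0) {   /* savemask=1: SIGPROF is unblocked again after the jump */
        if (arm_cpu_timer(budget_usec) != 0)
            return -1;
        job();
        arm_cpu_timer(0);                /* finished in time: disarm */
        return 0;
    }
    recover();                           /* the "exception handler" segment */
    return 1;
}

/* Demo jobs, purely illustrative. */
void quick_job(void) { }
void busy_job(void) { volatile unsigned long n = 0; for (;;) n++; }
void note_overrun(void) { overruns++; }
```

Calling run_with_wcet(20000, busy_job, note_overrun) would be expected to end up in note_overrun() once roughly 20 ms of CPU time has been consumed. As Jamie's objection points out, nothing here is enforced by the kernel: a bug that blocks SIGPROF or corrupts wcet_env defeats the mechanism, which is why the message positions it as a complement to kernel-side budget enforcement rather than a replacement.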
* Re: report a bug about sched_rt
2009-07-24 13:26 ` sen wang
2009-07-24 13:33 ` Peter Zijlstra
@ 2009-07-25 11:10 ` Raistlin
[not found] ` <454c71700907250429i1c77658bt6d65b02f08a29f4a@mail.gmail.com>
1 sibling, 1 reply; 37+ messages in thread
From: Raistlin @ 2009-07-25 11:10 UTC (permalink / raw)
To: sen wang
Cc: Peter Zijlstra, mingo, akpm, kernel, npiggin, arjan, linux-arm-kernel,
linux-kernel

[-- Attachment #1: Type: text/plain, Size: 917 bytes --]

On Fri, 2009-07-24 at 21:26 +0800, sen wang wrote:
> If cpu is free and there is a running state task,how can you scdedule
> idle task up?
>
Well, if you take a look at what Peter is trying to point out, you'll
find a lot of examples where providing an RT application with _more_ CPU
than it asks for leads to catastrophic consequences... These are some of
what we call "scheduling anomalies", and there are plenty of examples of
them! :-O

I think sched_rt determinism could be improved... But giving some random
task some random extra bandwidth is just going in the opposite
direction! :-(

Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)
http://blog.linux.it/raistlin / raistlin@ekiga.net /
dario.faggioli@jabber.org

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply [flat|nested] 37+ messages in thread
[parent not found: <454c71700907250429i1c77658bt6d65b02f08a29f4a@mail.gmail.com>]
* Re: report a bug about sched_rt
[not found] ` <454c71700907250429i1c77658bt6d65b02f08a29f4a@mail.gmail.com>
@ 2009-07-25 23:01 ` Jamie Lokier
0 siblings, 0 replies; 37+ messages in thread
From: Jamie Lokier @ 2009-07-25 23:01 UTC (permalink / raw)
To: sen wang
Cc: Raistlin, Peter Zijlstra, mingo, akpm, kernel, npiggin, arjan,
linux-arm-kernel, linux-kernel

sen wang wrote:
> but, to the realtime system like a decoder, I think we should meet the rt
> task as we can.

If the realtime task is something like a video decoder feeding a
display, and it is bandwidth-throttled only to ensure things like SSH
and filesystem I/O are still available, then I have to agree with Sen,
you would want any "spare" CPU to go to the video decoder in that
application, not the idle task.

> the RT scheduler is mainly used for realtime system which should have the
> different policy from fair task.

The RT scheduler is used for lots of different systems which you
haven't considered. For your application, probably the way
RT-bandwidth works is not useful. It's better for some other
applications.

-- Jamie

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 10:57 report a bug about sched_rt sen wang
2009-07-24 12:14 ` Peter Zijlstra
@ 2009-07-24 14:28 ` Arjan van de Ven
2009-07-26 3:55 ` sen wang
1 sibling, 1 reply; 37+ messages in thread
From: Arjan van de Ven @ 2009-07-24 14:28 UTC (permalink / raw)
To: sen wang; +Cc: mingo, akpm, kernel, npiggin, linux-arm-kernel, linux-kernel

On Fri, 24 Jul 2009 18:57:35 +0800
sen wang <wangsen.linux@gmail.com> wrote:

> I find something is wrong about sched_rt.
>
> when I am debugging my system with rt_bandwidth_enabled, there is a
> running realtime FIFO task in the sched_rt running queue and
> the fair running queue is empty. I found the idle task will be
> scheduled up when the running task still lie in the sched_rt running
> queue!
>
> this will happen when rt runqueue passed it's rt_bandwidth_enabled
> runtime,then the scheduler choose the idle task instead of realtime
> FIFO task.
>
> the reason lie in: when scheduler try to pick up a realtime FIFO task,
> it will check if rt_throttled is enabled,
> if so, it'll return and try fair queue but it is empty, then it come
> to the sched_idle class.
>
> I don't think it reasonable, we should give the realtime FIFO task the
> chance, even when rt runqueue passed it's runtime.
> because it is cpu's free time.

sounds like a good power limiting feature...

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: report a bug about sched_rt
2009-07-24 14:28 ` Arjan van de Ven
@ 2009-07-26 3:55 ` sen wang
0 siblings, 0 replies; 37+ messages in thread
From: sen wang @ 2009-07-26 3:55 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: akpm, linux-arm-kernel, linux-kernel

2009/7/24 Arjan van de Ven <arjan@infradead.org>
>
> On Fri, 24 Jul 2009 18:57:35 +0800
> sen wang <wangsen.linux@gmail.com> wrote:
>
> > I find something is wrong about sched_rt.
> >
> > when I am debugging my system with rt_bandwidth_enabled, there is a
> > running realtime FIFO task in the sched_rt running queue and
> > the fair running queue is empty. I found the idle task will be
> > scheduled up when the running task still lie in the sched_rt running
> > queue!
> >
> > this will happen when rt runqueue passed it's rt_bandwidth_enabled
> > runtime,then the scheduler choose the idle task instead of realtime
> > FIFO task.
> >
> > the reason lie in: when scheduler try to pick up a realtime FIFO task,
> > it will check if rt_throttled is enabled,
> > if so, it'll return and try fair queue but it is empty, then it come
> > to the sched_idle class.
> >
> > I don't think it reasonable, we should give the realtime FIFO task the
> > chance, even when rt runqueue passed it's runtime.
> > because it is cpu's free time.
>
>
> sounds like a good power limiting feature...
>
>

What I want to say is: if we give the CPU to the RT task in that
situation, a normal fair task still has a chance to get the CPU in the
remaining 50 ms (say the throttle is 950 ms), because in the remaining
50 ms the throttle is still enabled. Every tick, the RT scheduler will
check for normal fair tasks; if it finds one, RT will yield the CPU to
the newly arrived normal fair task.

In a realtime system there are still normal fair tasks, so the bandwidth
is useful; we can't simply turn it off.

I think a userspace scheduling policy is not feasible, because when, who
and where would implement it? glibc, uClibc, Android bionic? Can you
make sure they will all implement a compatible policy?

The kernel is the best place to do it. And since Linux is used in so
many fields, why not provide different features for different systems?
By adding a new kernel config item, say CONFIG_REAL_TIME_SYSTEM, we can
give the CPU to the RT task when the throttle is on; by disabling
CONFIG_REAL_TIME_SYSTEM, everything is kept untouched for server and
desktop.

I don't know if this viewpoint will offend somebody; sorry in advance :)

> --
> Arjan van de Ven Intel Open Source Technology Centre
> For development, discussion and tips for power savings,
> visit http://www.lesswatts.org

^ permalink raw reply [flat|nested] 37+ messages in thread
end of thread, other threads:[~2009-07-27 13:35 UTC | newest]
Thread overview: 37+ messages
2009-07-24 10:57 report a bug about sched_rt sen wang
2009-07-24 12:14 ` Peter Zijlstra
2009-07-24 13:04 ` sen wang
2009-07-24 13:14 ` Peter Zijlstra
2009-07-24 13:26 ` sen wang
2009-07-24 13:33 ` Peter Zijlstra
2009-07-24 13:44 ` sen wang
2009-07-24 13:54 ` Peter Zijlstra
2009-07-24 14:04 ` sen wang
2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 14:53 ` sen wang
2009-07-24 15:07 ` sen wang
2009-07-24 15:24 ` Peter Zijlstra
2009-07-24 15:43 ` sen wang
2009-07-24 15:34 ` Thomas Gleixner
2009-07-25 11:12 ` Raistlin
2009-07-24 14:24 ` sen wang
2009-07-24 14:48 ` Peter Zijlstra
2009-07-24 15:02 ` sen wang
2009-07-24 15:40 ` Jamie Lokier
2009-07-24 16:01 ` Peter Zijlstra
2009-07-24 23:30 ` Jamie Lokier
2009-07-25 5:22 ` Bill Gatliff
2009-07-25 22:48 ` Jamie Lokier
2009-07-26 2:44 ` Bill Gatliff
2009-07-26 19:03 ` Jamie Lokier
2009-07-27 10:45 ` Peter Zijlstra
2009-07-27 13:35 ` Bill Gatliff
2009-07-25 12:33 ` Raistlin
2009-07-25 14:58 ` Tommaso Cucinotta
2009-07-25 12:19 ` Raistlin
2009-07-25 22:54 ` Jamie Lokier
2009-07-25 23:24 ` Tommaso Cucinotta
2009-07-25 11:10 ` Raistlin
[not found] ` <454c71700907250429i1c77658bt6d65b02f08a29f4a@mail.gmail.com>
2009-07-25 23:01 ` Jamie Lokier
2009-07-24 14:28 ` Arjan van de Ven
2009-07-26 3:55 ` sen wang