* Bug in scheduler when using rt_mutex
From: Onkalo Samu @ 2011-01-17 14:42 UTC
To: mingo, peterz; +Cc: Onkalo Samu.P, linux-kernel@vger.kernel.org

Hi

I believe that there are some problems in the scheduling when the
following happens:
- A normal priority process locks an rt_mutex and sleeps while keeping
  it locked.
- An RT priority process blocks on the rt_mutex while the normal
  priority process is sleeping.

This sequence can occur with I2C access when both a normal priority
thread and an irq-thread access the same I2C bus. The I2C core contains
an rt_mutex, and I2C drivers can sleep with wait_for_completion.

I have seen the following failure happen (also with 2.6.37):

A user process accesses some device handle or sysfs entry which finally
makes an I2C access. The I2C core contains rt_mutex protection against
parallel access. Sometimes, after the rt_mutex is unlocked, the user
process does not run for a long time (several minutes). This can occur
when only a small number of user processes is running. In my test case
there was only "cat /dev/zero > /dev/null" running in the background
while another process was accessing a sysfs entry.

Example:

cat /dev/zero > /dev/null &
while [ 1 ] ; do
        cat /sys/devices/platform/lis3lv02d/selftest
done

Selftest causes I2C accesses from both the user process and the
irq-thread.

Based on my debugging, the following sequence occurs (single CPU
system):

1) Some user process is running in the background (like cat /dev/zero..)
2) A user process reads a sysfs entry, which causes an I2C access
3) The user process locks the rt_mutex in the I2C core
4) The user process sleeps while it keeps the rt_mutex locked
   (wait_for_completion in the I2C transfer function)
5) The irq-thread is kicked to run
6) The irq-thread tries to take the rt_mutex, which is already locked
   by the user process
7) The sleeping user process is promoted to irq-thread priority
   (RT class)
8) The user process is woken up by the completion mechanism and
   finishes its job
9) The user process unlocks the rt_mutex and is changed back to its old
   priority and scheduling class
10) The irq-thread continues as expected

The user process is stuck at phase 9. The scheduler may skip that
process for a long time. Based on my analysis, the vruntime calculation
fails for the user process: at phase 9, the vruntime for that
sched_entity is much bigger than that of other processes, which leads
to a situation where it is not scheduled for a long time.

The problem is that at phase 7 the user process is sleeping, and the
rt_mutex priority change is applied to the sleeping task. se.vruntime
is not modified, so when the user process continues running,
se.vruntime contains about twice the cfs_rq.min_vruntime value.

Success case:
- user process locks rt_mutex
- irq-thread causes the user process to be promoted to RT level while
  the user process is running and in the "on_rq == 1" state
  -> dequeue_task is called, which modifies se.vruntime

dequeue_entity function:
        if (!(flags & DEQUEUE_SLEEP))
                se->vruntime -= cfs_rq->min_vruntime;

When the process is moved back from RT to normal priority, enqueue_task
updates vruntime again to the correct value:

enqueue_entity:
        if (!(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_WAKING))
                se->vruntime += cfs_rq->min_vruntime;

Failure case:
- user process locks rt_mutex
- and goes to sleep (wait_for_completion etc.)
- user process is dequeued in the sleep state
  -> vruntime is not updated in dequeue_entity
- irq-thread blocks on the rt_mutex and the user process is promoted to
  RT priority
- user process wakes up and continues until it releases the rt_mutex
  -> the user process is moved from the rt queue to the cfs queue. The
  WAKEUP / WAKING flags are not set, so vruntime is updated to an
  incorrect value.

I have a simple dummy-driver which demonstrates the case. It is tested
on a single CPU embedded system with 2.6.37. I also have a correction
proposal, but it is quite possible that there is a better way to do
this, and it may be that I miss some case totally; the scheduler is
quite a complex thing. I'll send patches for the test case and for the
proposal.

Br,
Samu Onkalo
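[The dummy-driver itself is not posted in this thread. A minimal sketch
of the pattern it needs to exercise could look like the module fragment
below; all names are illustrative assumptions, and the delayed work
merely stands in for whatever makes the "transfer" complete from
another context.]

#include <linux/module.h>
#include <linux/rtmutex.h>
#include <linux/completion.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static struct rt_mutex test_lock;	/* stands in for the i2c-core rt_mutex */
static struct completion test_done;	/* stands in for the transfer completion */
static struct delayed_work test_work;

/* Simulates the interrupt/DMA side finishing the "transfer". */
static void test_work_fn(struct work_struct *work)
{
	complete(&test_done);
}

/*
 * Call this from both a normal-priority context (e.g. a sysfs read) and
 * an RT-priority irq-thread: it locks the rt_mutex and then sleeps
 * while holding it, which is exactly the sequence described above.
 */
static void test_transfer(void)
{
	rt_mutex_lock(&test_lock);
	init_completion(&test_done);
	schedule_delayed_work(&test_work, msecs_to_jiffies(10));
	wait_for_completion(&test_done);	/* sleep with the rt_mutex held */
	rt_mutex_unlock(&test_lock);		/* priority/class restored here */
}

static int __init test_init(void)
{
	rt_mutex_init(&test_lock);
	INIT_DELAYED_WORK(&test_work, test_work_fn);
	return 0;
}
module_init(test_init);

static void __exit test_exit(void)
{
	cancel_delayed_work_sync(&test_work);
}
module_exit(test_exit);

MODULE_LICENSE("GPL");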
* Re: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-17 15:00 UTC
To: samu.p.onkalo; +Cc: mingo, linux-kernel@vger.kernel.org, tglx

On Mon, 2011-01-17 at 16:42 +0200, Onkalo Samu wrote:
> Hi
>
> I believe that there are some problems in the scheduling when
> the following happens:
> - Normal priority process locks rt_mutex and sleeps while keeping it
>   locked.

There's your fail, don't do that!

> - RT priority process blocks on the rt_mutex while normal priority
>   process is sleeping
>
> This sequence can occur with I2C access when both normal priority
> thread and irq-thread access the same I2C bus. I2C core
> contains rt_mutex and I2C drivers can sleep with wait_for_completion.

Why does the I2C core use rt_mutex? That's utterly broken.

> Based on my debugging the following sequence occurs (single CPU
> system):
>
> 1) Some user process is running in the background (like
>    cat /dev/zero..)
> 2) A user process reads a sysfs entry which causes an I2C access
> 3) The user process locks the rt_mutex in the I2C core
> 4) The user process sleeps while it keeps the rt_mutex locked
>    (wait_for_completion in the I2C transfer function)

That's where things go wrong; there's absolutely nothing you can do to
fix the system once you block while holding a mutex.
* RE: Bug in scheduler when using rt_mutex
From: samu.p.onkalo @ 2011-01-17 15:15 UTC
To: peterz; +Cc: mingo, linux-kernel, tglx

>-----Original Message-----
>From: ext Peter Zijlstra [mailto:peterz@infradead.org]
>Sent: 17 January, 2011 17:00
>To: Onkalo Samu.P (Nokia-MS/Tampere)
>Cc: mingo@elte.hu; linux-kernel@vger.kernel.org; tglx
>Subject: Re: Bug in scheduler when using rt_mutex
>
>On Mon, 2011-01-17 at 16:42 +0200, Onkalo Samu wrote:
>> Hi
>>
>> I believe that there are some problems in the scheduling when
>> the following happens:
>> - Normal priority process locks rt_mutex and sleeps while keeping it
>>   locked.
>
>There's your fail, don't do that!

So that is forbidden:

rt_mutex_lock();
wait_for_completion();  <--- shared HW finishes its job
rt_mutex_unlock();

>> - RT priority process blocks on the rt_mutex while normal priority
>>   process is sleeping
>>
>> This sequence can occur with I2C access when both normal priority
>> thread and irq-thread access the same I2C bus. I2C core
>> contains rt_mutex and I2C drivers can sleep with wait_for_completion.
>
>Why does the I2C core use rt_mutex? That's utterly broken.

To get a low priority task to finish an ongoing I2C access in time
under heavy load, I think.

>> Based on my debugging the following sequence occurs (single CPU
>> system):
>>
>> 1) Some user process is running in the background (like
>>    cat /dev/zero..)
>> 2) A user process reads a sysfs entry which causes an I2C access
>> 3) The user process locks the rt_mutex in the I2C core
>> 4) The user process sleeps while it keeps the rt_mutex locked
>>    (wait_for_completion in the I2C transfer function)
>
>That's where things go wrong; there's absolutely nothing you can do to
>fix the system once you block while holding a mutex.

Of course other processes are waiting until the (rt_)mutex is unlocked.
The problem is that after the rt_mutex_unlock is done, the task which
just released the lock may stay in some non-running state for minutes.

-Samu
* RE: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-17 15:28 UTC
To: samu.p.onkalo; +Cc: mingo, linux-kernel, tglx

On Mon, 2011-01-17 at 15:15 +0000, samu.p.onkalo@nokia.com wrote:
> >There's your fail, don't do that!
>
> So that is forbidden:
>
> rt_mutex_lock();
> wait_for_completion();  <--- shared HW finishes its job
> rt_mutex_unlock();

Well, it's pointless, it's non-deterministic, so you totally void the
usage of rt_mutex.

> >Why does the I2C core use rt_mutex? That's utterly broken.
>
> To get a low priority task to finish an ongoing I2C access in time
> under heavy load, I think.

FYI, I'm queueing a revert for that patch. Random driver junk should
not _ever_ use that.

> >That's where things go wrong; there's absolutely nothing you can do to
> >fix the system once you block while holding a mutex.
>
> Of course other processes are waiting until the (rt_)mutex is unlocked.
> The problem is that after the rt_mutex_unlock is done, the task which
> just released the lock may stay in some non-running state for minutes.

Yeah, saw that, writing a patch for that; there's more than one problem
there.
* Re: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-17 16:00 UTC
To: samu.p.onkalo; +Cc: mingo, linux-kernel@vger.kernel.org, tglx

On Mon, 2011-01-17 at 16:42 +0200, Onkalo Samu wrote:
>
> Failure case:
> - user process locks rt_mutex
> - and goes to sleep (wait_for_completion etc.)
> - user process is dequeued in the sleep state
>   -> vruntime is not updated in dequeue_entity
>

Does the below (completely untested) patch help?

---
 kernel/sched.c      |    6 +++++-
 kernel/sched_fair.c |   11 +++++++++++
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index a0eb094..be09581 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -8108,8 +8108,10 @@ EXPORT_SYMBOL(__might_sleep);
 #ifdef CONFIG_MAGIC_SYSRQ
 static void normalize_task(struct rq *rq, struct task_struct *p)
 {
+	struct sched_class *prev_class = p->sched_class;
+	int old_prio = p->prio;
 	int on_rq;
-
+
 	on_rq = p->se.on_rq;
 	if (on_rq)
 		deactivate_task(rq, p, 0);
@@ -8118,6 +8120,8 @@ static void normalize_task(struct rq *rq, struct task_struct *p)
 		activate_task(rq, p, 0);
 		resched_task(rq->curr);
 	}
+
+	check_class_changed(rq, p, prev_class, old_prio, task_current(rq, p));
 }
 
 void normalize_rt_tasks(void)
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index c62ebae..0a27b00 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -4072,6 +4072,17 @@ static void prio_changed_fair(struct rq *rq, struct task_struct *p,
 static void switched_to_fair(struct rq *rq, struct task_struct *p,
 			     int running)
 {
+	struct sched_entity *se = &p->se;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+
+	if (se->on_rq && cfs_rq->curr != se)
+		__dequeue_entity(cfs_rq, se);
+
+	place_entity(cfs_rq, se, 0);
+
+	if (se->on_rq && cfs_rq->curr != se)
+		__enqueue_entity(cfs_rq, se);
+
 	/*
 	 * We were most likely switched from sched_rt, so
 	 * kick off the schedule if running, otherwise just see
* Re: Bug in scheduler when using rt_mutex
From: Onkalo Samu @ 2011-01-18 8:23 UTC
To: ext Peter Zijlstra; +Cc: mingo, linux-kernel@vger.kernel.org, tglx, Onkalo Samu.P

On Mon, 2011-01-17 at 17:00 +0100, ext Peter Zijlstra wrote:
> On Mon, 2011-01-17 at 16:42 +0200, Onkalo Samu wrote:
> >
> > Failure case:
> > - user process locks rt_mutex
> > - and goes to sleep (wait_for_completion etc.)
> > - user process is dequeued in the sleep state
> >   -> vruntime is not updated in dequeue_entity
> >
>
> Does the below (completely untested) patch help?

Unfortunately no. The user process is still stuck, for about the time
equal to min_vruntime.

Background of the rt_mutex in i2c-core:
http://www.mail-archive.com/linux-i2c@vger.kernel.org/msg01631.html
In phones the touch screen controller can be connected to I2C, which
makes it very timing sensitive.

The patch below shows what goes wrong, and at least it corrects the
case that I can see. The idea is to reset vruntime to min_vruntime when
the task goes back from the rt to the cfs queue.

-Samu

---
 include/linux/sched.h |    1 +
 kernel/sched.c        |   12 +++++++++---
 kernel/sched_fair.c   |    8 ++++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d747f94..1c7f873 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1049,6 +1049,7 @@ struct sched_domain;
 #define ENQUEUE_WAKEUP		1
 #define ENQUEUE_WAKING		2
 #define ENQUEUE_HEAD		4
+#define ENQUEUE_FROM_RTMUTEX	8
 
 #define DEQUEUE_SLEEP		1
 
diff --git a/kernel/sched.c b/kernel/sched.c
index ea3e5ef..4724e25 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4544,6 +4544,7 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 {
 	unsigned long flags;
 	int oldprio, on_rq, running;
+	int enqueue_flags = 0;
 	struct rq *rq;
 	const struct sched_class *prev_class;
 
@@ -4561,17 +4562,22 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 	if (running)
 		p->sched_class->put_prev_task(rq, p);
 
-	if (rt_prio(prio))
+	if (rt_prio(prio)) {
 		p->sched_class = &rt_sched_class;
-	else
+	} else {
 		p->sched_class = &fair_sched_class;
+		if (p->sched_class != prev_class)
+			enqueue_flags = ENQUEUE_FROM_RTMUTEX;
+	}
 
 	p->prio = prio;
 
 	if (running)
 		p->sched_class->set_curr_task(rq);
 	if (on_rq) {
-		enqueue_task(rq, p, oldprio < prio ? ENQUEUE_HEAD : 0);
+
+		enqueue_task(rq, p, (oldprio < prio ? ENQUEUE_HEAD : 0) |
+			     enqueue_flags);
 
 		check_class_changed(rq, p, prev_class, oldprio, running);
 	}
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index c62ebae..ff670b4 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -941,6 +941,14 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 static void
 enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
+
+	/*
+	 * vruntime may be incorrect if coming from rt to non rt priority.
+	 * Reset vruntime so that it has some valid value. WAKE* flags are not
+	 * set in this case.
+	 */
+	if (unlikely(flags & ENQUEUE_FROM_RTMUTEX))
+		se->vruntime = 0;
 	/*
 	 * Update the normalized vruntime before updating min_vruntime
 	 * through callig update_curr().
* Re: Bug in scheduler when using rt_mutex
From: Yong Zhang @ 2011-01-18 8:59 UTC
To: samu.p.onkalo; +Cc: ext Peter Zijlstra, mingo, linux-kernel@vger.kernel.org, tglx

On Tue, Jan 18, 2011 at 4:23 PM, Onkalo Samu <samu.p.onkalo@nokia.com> wrote:
> On Mon, 2011-01-17 at 17:00 +0100, ext Peter Zijlstra wrote:
>> On Mon, 2011-01-17 at 16:42 +0200, Onkalo Samu wrote:
>> >
>> > Failure case:
>> > - user process locks rt_mutex
>> > - and goes to sleep (wait_for_completion etc.)
>> > - user process is dequeued in the sleep state
>> >   -> vruntime is not updated in dequeue_entity
>> >
>>
>> Does the below (completely untested) patch help?
>
> Unfortunately no.

It couldn't work because place_entity() will not place it
backwards.

--
Only stand for myself
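[For reference, place_entity() only ever moves an entity to the right.
Its tail, approximated below from 2.6.37's kernel/sched_fair.c with the
START_DEBIT and sleeper-credit adjustments elided, clamps against the
entity's current vruntime. Since the stuck task's vruntime is already
roughly twice min_vruntime, the max_vruntime() clamp keeps the stale
value and the place_entity()-based reset in the patch above has no
effect.]

static void
place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
{
	u64 vruntime = cfs_rq->min_vruntime;

	/* START_DEBIT and sleeper-credit adjustments elided ... */

	/* ensure we never gain time by being placed backwards. */
	vruntime = max_vruntime(se->vruntime, vruntime);

	se->vruntime = vruntime;
}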
* Re: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-18 13:35 UTC
To: Yong Zhang; +Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Tue, 2011-01-18 at 16:59 +0800, Yong Zhang wrote:
> On Tue, Jan 18, 2011 at 4:23 PM, Onkalo Samu <samu.p.onkalo@nokia.com> wrote:
> > On Mon, 2011-01-17 at 17:00 +0100, ext Peter Zijlstra wrote:
> >> Does the below (completely untested) patch help?
> >
> > Unfortunately no.
>
> It couldn't work because place_entity() will not place it
> backwards.

Ah, indeed, I was somehow assuming it was way left, but that is not at
all true; something like the below then..

---
Subject: sched: Fix switch_to_fair()
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Mon Jan 17 17:03:27 CET 2011

When a task is placed back into fair_sched_class, we must update its
placement, since we don't know how long it's been gone; hence its
vruntime is stale and cannot be trusted.

There is also a case where it was moved from fair_sched_class when it
was in a blocked state and moved back while it is running; this causes
an imbalance between DEQUEUE_SLEEP/DEQUEUE_WAKEUP for the fair class
and leaves vruntime way out there (due to the min_vruntime
adjustment).

Also update sysrq-n to call the ->switched_{to,from} methods.

Reported-by: Onkalo Samu <samu.p.onkalo@nokia.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c      |    4 ++++
 kernel/sched_fair.c |   16 ++++++++++++++++
 2 files changed, 20 insertions(+)

Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -8106,6 +8106,8 @@ EXPORT_SYMBOL(__might_sleep);
 #ifdef CONFIG_MAGIC_SYSRQ
 static void normalize_task(struct rq *rq, struct task_struct *p)
 {
+	struct sched_class *prev_class = p->sched_class;
+	int old_prio = p->prio;
 	int on_rq;
 
 	on_rq = p->se.on_rq;
@@ -8116,6 +8118,8 @@ static void normalize_task(struct rq *rq
 		activate_task(rq, p, 0);
 		resched_task(rq->curr);
 	}
+
+	check_class_changed(rq, p, prev_class, old_prio, task_current(rq, p));
 }
 
 void normalize_rt_tasks(void)
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -4075,6 +4075,22 @@ static void prio_changed_fair(struct rq
 static void switched_to_fair(struct rq *rq, struct task_struct *p,
 			     int running)
 {
+	struct sched_entity *se = &p->se;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+
+	if (se->on_rq && cfs_rq->curr != se)
+		__dequeue_entity(cfs_rq, se);
+
+	/*
+	 * se->vruntime can be completely out there, there is no telling
+	 * how long this task was !fair and on what CPU if any it became
+	 * !fair. Therefore, reset it to a known, reasonable value.
+	 */
+	se->vruntime = cfs_rq->min_vruntime;
+
+	if (se->on_rq && cfs_rq->curr != se)
+		__enqueue_entity(cfs_rq, se);
+
 	/*
 	 * We were most likely switched from sched_rt, so
 	 * kick off the schedule if running, otherwise just see
* Re: Bug in scheduler when using rt_mutex
From: Onkalo Samu @ 2011-01-18 14:25 UTC
To: ext Peter Zijlstra; +Cc: Yong Zhang, mingo, linux-kernel@vger.kernel.org, tglx

On Tue, 2011-01-18 at 14:35 +0100, ext Peter Zijlstra wrote:
> On Tue, 2011-01-18 at 16:59 +0800, Yong Zhang wrote:
> > It couldn't work because place_entity() will not place it
> > backwards.
>
> Ah, indeed, I was somehow assuming it was way left, but that is not at
> all true; something like the below then..

First trials show that my test case is no longer stuck. I'll continue
testing. Thanks.

How about this i2c-core case: do you see it so that the rt_mutex must
be taken away? It reduces latencies for the I2C accesses in the
irq-threads.

-Samu
* Re: Bug in scheduler when using rt_mutex
From: Yong Zhang @ 2011-01-19 2:38 UTC
To: Peter Zijlstra; +Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Tue, Jan 18, 2011 at 9:35 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> Subject: sched: Fix switch_to_fair()
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Mon Jan 17 17:03:27 CET 2011
>
> When a task is placed back into fair_sched_class, we must update its
> placement, since we don't know how long it's been gone; hence its
> vruntime is stale and cannot be trusted.
>
> There is also a case where it was moved from fair_sched_class when it
> was in a blocked state and moved back while it is running; this causes
> an imbalance between DEQUEUE_SLEEP/DEQUEUE_WAKEUP for the fair class
> and leaves vruntime way out there (due to the min_vruntime
> adjustment).
>
> Also update sysrq-n to call the ->switched_{to,from} methods.
>
> Reported-by: Onkalo Samu <samu.p.onkalo@nokia.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
> Index: linux-2.6/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched_fair.c
> +++ linux-2.6/kernel/sched_fair.c
> @@ -4075,6 +4075,22 @@ static void prio_changed_fair(struct rq
>  static void switched_to_fair(struct rq *rq, struct task_struct *p,
>  			     int running)
>  {
> +	struct sched_entity *se = &p->se;
> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +
> +	if (se->on_rq && cfs_rq->curr != se)

(cfs_rq->curr != se) equals (!running), no?

> +		__dequeue_entity(cfs_rq, se);
> +
> +	/*
> +	 * se->vruntime can be completely out there, there is no telling
> +	 * how long this task was !fair and on what CPU if any it became
> +	 * !fair. Therefore, reset it to a known, reasonable value.
> +	 */
> +	se->vruntime = cfs_rq->min_vruntime;

But this is not fair for a !SLEEP task.
You know se->vruntime -= cfs_rq->min_vruntime is done for a !SLEEP
task; then, after it goes through sched_fair --> sched_rt --> sched_fair
by some means, the current cfs_rq->min_vruntime is added back.

But here the se is put before where it should be. Is this what we want?

Thanks,
Yong

--
Only stand for myself
* Re: Bug in scheduler when using rt_mutex
From: Mike Galbraith @ 2011-01-19 3:43 UTC
To: Yong Zhang; +Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 10:38 +0800, Yong Zhang wrote:

> > +	if (se->on_rq && cfs_rq->curr != se)
>
> (cfs_rq->curr != se) equals (!running), no?

No, running is task_of(se) == rq->curr. Another class or fair group
task may be rq_of(cfs_rq)->curr.

> > +	se->vruntime = cfs_rq->min_vruntime;
>
> But this is not fair for a !SLEEP task.
> You know se->vruntime -= cfs_rq->min_vruntime is done for a !SLEEP
> task; then, after it goes through sched_fair --> sched_rt --> sched_fair
> by some means, the current cfs_rq->min_vruntime is added back.

It drops lag for all, positive or negative.

> But here the se is put before where it should be. Is this what we want?

It may move forward or backward. If transitions can happen at high
frequency it could be a problem; otherwise, it's a cornercase blip.

An alternative is to leave lag alone and normalize sleepers, but that's
(did that) considerably more intrusive.

	-Mike
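[To put numbers on the lag bookkeeping being discussed: a toy userspace
calculation, with illustrative values only (not kernel code), of how
the dequeue-side subtraction and enqueue-side addition preserve a
task's lag across a class switch, and how the sleeping-task failure
case described earlier double-counts min_vruntime.]

#include <stdio.h>

int main(void)
{
	unsigned long long min_vruntime = 1000000;	/* cfs_rq->min_vruntime at class switch */
	unsigned long long se_vruntime  = 1000500;	/* task vruntime: lag of +500 */

	/* normal path: dequeue (!DEQUEUE_SLEEP) normalizes to pure lag */
	unsigned long long lag = se_vruntime - min_vruntime;	/* 500 */

	/* time passes while the task is in the RT class */
	min_vruntime += 300000;

	/* enqueue without WAKEUP/WAKING re-adds the new min_vruntime */
	unsigned long long good = lag + min_vruntime;	/* lag preserved */

	/* buggy path: the task slept while being boosted, so the dequeue-side
	 * subtraction never happened, yet enqueue still adds min_vruntime */
	unsigned long long bad = se_vruntime + min_vruntime;

	printf("correct enqueue vruntime: %llu\n", good);
	printf("buggy   enqueue vruntime: %llu (excess %llu)\n", bad, bad - good);
	return 0;
}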
* Re: Bug in scheduler when using rt_mutex
From: Yong Zhang @ 2011-01-19 4:35 UTC
To: Mike Galbraith; +Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, Jan 19, 2011 at 11:43 AM, Mike Galbraith <efault@gmx.de> wrote:
>> (cfs_rq->curr != se) equals (!running), no?
>
> No, running is task_of(se) == rq->curr. Another class or fair group
> task may be rq_of(cfs_rq)->curr.

If task_of(se) != rq->curr, as you said, it may be on another class or
fair group; then for its fair group, cfs_rq->curr == NULL.

cfs_rq->curr != se is always true.

Thanks,
Yong

--
Only stand for myself
* Re: Bug in scheduler when using rt_mutex
From: Mike Galbraith @ 2011-01-19 5:40 UTC
To: Yong Zhang; +Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 12:35 +0800, Yong Zhang wrote:

> cfs_rq->curr != se is always true.

If that were always true, we'd illegally enqueue a running task.

	if (running)
		p->sched_class->set_curr_task(rq);
	==> set_curr_task_fair(rq);
	    ==> struct sched_entity *se = &rq->curr->se;
	        ==> set_next_entity(cfs_rq_of(se), se);
	            ==> cfs_rq->curr = se;  (prevents the running entity
	                from being enqueued below)

	if (on_rq) {
		enqueue_task(rq, p, oldprio < prio ? ENQUEUE_HEAD : 0);

		check_class_changed(rq, p, prev_class, oldprio, running);
	}

	-Mike
* Re: Bug in scheduler when using rt_mutex
From: Yong Zhang @ 2011-01-19 6:09 UTC
To: Mike Galbraith; +Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, Jan 19, 2011 at 1:40 PM, Mike Galbraith <efault@gmx.de> wrote:
> On Wed, 2011-01-19 at 12:35 +0800, Yong Zhang wrote:
>
>> cfs_rq->curr != se is always true.
>
> If that were always true, we'd illegally enqueue a running task.

I'm sorry that I didn't express myself correctly.

The conclusion (cfs_rq->curr != se is always true) is not
self-contained. IOW, it's based on one condition, which is
(task_of(se) != rq->curr). So what I want to say is:

  task_of(se) != rq->curr  ==>  cfs_rq_of(se)->curr != se

So:

  !running  ==>  cfs_rq_of(se)->curr != se

Is this more clear?

Thanks,
Yong

--
Only stand for myself
* Re: Bug in scheduler when using rt_mutex
From: Mike Galbraith @ 2011-01-19 6:37 UTC
To: Yong Zhang; +Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 14:09 +0800, Yong Zhang wrote:
> On Wed, Jan 19, 2011 at 1:40 PM, Mike Galbraith <efault@gmx.de> wrote:
> > If that were always true, we'd illegally enqueue a running task.
>
> I'm sorry that I didn't express myself correctly.

Human communication methods are all buggy as hell :)

> The conclusion (cfs_rq->curr != se is always true) is not
> self-contained. IOW, it's based on one condition, which is
> (task_of(se) != rq->curr). So what I want to say is:
>
>   task_of(se) != rq->curr  ==>  cfs_rq_of(se)->curr != se
>
> So:
>
>   !running  ==>  cfs_rq_of(se)->curr != se
>
> Is this more clear?

Yeah.

	-Mike
* Re: Bug in scheduler when using rt_mutex
From: Ingo Molnar @ 2011-01-19 7:19 UTC
To: Mike Galbraith; +Cc: Yong Zhang, Peter Zijlstra, samu.p.onkalo, linux-kernel@vger.kernel.org, tglx

* Mike Galbraith <efault@gmx.de> wrote:

> On Wed, 2011-01-19 at 14:09 +0800, Yong Zhang wrote:
> > I'm sorry that I didn't express myself correctly.
>
> Human communication methods are all buggy as hell :)

Not to mention that they are slow, inefficient and ambiguous.

But what did you expect? The original authors of the code are long gone,
and maintenance is done by newcomers who are patching the code bit by
bit. What you get from such a development model is pretty predictable:
~1 billion year old spaghetti DNA that no-one truly understands.

	Ingo
* Re: Bug in scheduler when using rt_mutex
From: Mike Galbraith @ 2011-01-19 7:41 UTC
To: Ingo Molnar; +Cc: Yong Zhang, Peter Zijlstra, samu.p.onkalo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 08:19 +0100, Ingo Molnar wrote:
> * Mike Galbraith <efault@gmx.de> wrote:
> > Human communication methods are all buggy as hell :)
>
> Not to mention that they are slow, inefficient and ambiguous.
>
> But what did you expect? The original authors of the code are long gone,
> and maintenance is done by newcomers who are patching the code bit by
> bit. What you get from such a development model is pretty predictable:
> ~1 billion year old spaghetti DNA that no-one truly understands.

Gotta give the original authors credit though: their self-modifying
code comes equipped with a fully automated hardware break-point
debugger.

	-Mike
* Re: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-19 9:44 UTC
To: Yong Zhang; +Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 10:38 +0800, Yong Zhang wrote:
> > +	if (se->on_rq && cfs_rq->curr != se)
>
> (cfs_rq->curr != se) equals (!running), no?

More or less; the idea is that we only call __{dequeue,enqueue}_entity()
when the task is actually in the tree and current is not.

> > +		__dequeue_entity(cfs_rq, se);
> > +
> > +	/*
> > +	 * se->vruntime can be completely out there, there is no telling
> > +	 * how long this task was !fair and on what CPU if any it became
> > +	 * !fair. Therefore, reset it to a known, reasonable value.
> > +	 */
> > +	se->vruntime = cfs_rq->min_vruntime;
>
> But this is not fair for a !SLEEP task.
> You know se->vruntime -= cfs_rq->min_vruntime is done for a !SLEEP
> task; then, after it goes through sched_fair --> sched_rt --> sched_fair
> by some means, the current cfs_rq->min_vruntime is added back.
>
> But here the se is put before where it should be. Is this what we want?

Well, it's more or less screwy anyway, since we don't know for how long
the task was !fair and what cpu it came from etc.. But I guess you're
right, we should at least pretend the whole min_vruntime thing is the
0-lag point (it's not) and preserve 'lag' like we do for migrations...

Something like the below.. except I've got a massive headache and I'm
not at all sure I got the switched_from_fair() bit right.

---
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -8108,6 +8108,8 @@ EXPORT_SYMBOL(__might_sleep);
 #ifdef CONFIG_MAGIC_SYSRQ
 static void normalize_task(struct rq *rq, struct task_struct *p)
 {
+	struct sched_class *prev_class = p->sched_class;
+	int old_prio = p->prio;
 	int on_rq;
 
 	on_rq = p->se.on_rq;
@@ -8118,6 +8120,8 @@ static void normalize_task(struct rq *rq
 		activate_task(rq, p, 0);
 		resched_task(rq->curr);
 	}
+
+	check_class_changed(rq, p, prev_class, old_prio, task_current(rq, p));
 }
 
 void normalize_rt_tasks(void)
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -4066,12 +4066,33 @@ static void prio_changed_fair(struct rq
 	check_preempt_curr(rq, p, 0);
 }
 
+static void
+switched_from_fair(struct rq *rq, struct task_struct *p, int running)
+{
+	struct sched_entity *se = &p->se;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+
+	if (!se->on_rq && p->state != TASK_RUNNING)
+		se->vruntime -= cfs_rq->min_vruntime;
+}
+
 /*
  * We switched to the sched_fair class.
  */
-static void switched_to_fair(struct rq *rq, struct task_struct *p,
-			     int running)
+static void
+switched_to_fair(struct rq *rq, struct task_struct *p, int running)
 {
+	struct sched_entity *se = &p->se;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+
+	if (se->on_rq && cfs_rq->curr != se)
+		__dequeue_entity(cfs_rq, se);
+
+	se->vruntime += cfs_rq->min_vruntime;
+
+	if (se->on_rq && cfs_rq->curr != se)
+		__enqueue_entity(cfs_rq, se);
+
 	/*
 	 * We were most likely switched from sched_rt, so
 	 * kick off the schedule if running, otherwise just see
@@ -4163,6 +4184,7 @@ static const struct sched_class fair_sch
 	.task_fork	= task_fork_fair,
 
 	.prio_changed	= prio_changed_fair,
+	.switched_from	= switched_from_fair,
 	.switched_to	= switched_to_fair,
 
 	.get_rr_interval = get_rr_interval_fair,
* Re: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-19 10:38 UTC
To: Yong Zhang; +Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 10:44 +0100, Peter Zijlstra wrote:
> +static void
> +switched_from_fair(struct rq *rq, struct task_struct *p, int running)
> +{
> +	struct sched_entity *se = &p->se;
> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +
> +	if (!se->on_rq && p->state != TASK_RUNNING)
> +		se->vruntime -= cfs_rq->min_vruntime;
> +}
> +
>  /*
>   * We switched to the sched_fair class.
>   */
> -static void switched_to_fair(struct rq *rq, struct task_struct *p,
> -			     int running)
> +static void
> +switched_to_fair(struct rq *rq, struct task_struct *p, int running)
>  {
> +	struct sched_entity *se = &p->se;
> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +
> +	if (se->on_rq && cfs_rq->curr != se)
> +		__dequeue_entity(cfs_rq, se);
> +
> +	se->vruntime += cfs_rq->min_vruntime;
> +
> +	if (se->on_rq && cfs_rq->curr != se)
> +		__enqueue_entity(cfs_rq, se);
> +
>  	/*
>  	 * We were most likely switched from sched_rt, so
>  	 * kick off the schedule if running, otherwise just see

Hrm, I think the "to" bit is not needed with the "from" thing in place;
the enqueue from _setprio will already have added min_vruntime.
* Re: Bug in scheduler when using rt_mutex
From: Peter Zijlstra @ 2011-01-19 11:30 UTC
To: Yong Zhang; +Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 11:38 +0100, Peter Zijlstra wrote:

> Hrm, I think the "to" bit is not needed with the "from" thing in place;
> the enqueue from _setprio will already have added min_vruntime.

Would lead to something like this:

---
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -8108,6 +8108,8 @@ EXPORT_SYMBOL(__might_sleep);
 #ifdef CONFIG_MAGIC_SYSRQ
 static void normalize_task(struct rq *rq, struct task_struct *p)
 {
+	struct sched_class *prev_class = p->sched_class;
+	int old_prio = p->prio;
 	int on_rq;
 
 	on_rq = p->se.on_rq;
@@ -8118,6 +8120,8 @@ static void normalize_task(struct rq *rq
 		activate_task(rq, p, 0);
 		resched_task(rq->curr);
 	}
+
+	check_class_changed(rq, p, prev_class, old_prio, task_current(rq, p));
 }
 
 void normalize_rt_tasks(void)
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -4066,11 +4066,30 @@ static void prio_changed_fair(struct rq
 	check_preempt_curr(rq, p, 0);
 }
 
+static void
+switched_from_fair(struct rq *rq, struct task_struct *p, int running)
+{
+	struct sched_entity *se = &p->se;
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+
+	/*
+	 * Ensure the task's vruntime is normalized, so that when its
+	 * switched back to the fair class the enqueue_entity(.flags=0) will
+	 * do the right thing.
+	 *
+	 * If it was on_rq, then the dequeue_entity(.flags=0) will already
+	 * have normalized the vruntime, if it was !on_rq, then only when
+	 * the task is sleeping will it still have non-normalized vruntime.
+	 */
+	if (!se->on_rq && p->state != TASK_RUNNING)
+		se->vruntime -= cfs_rq->min_vruntime;
+}
+
 /*
  * We switched to the sched_fair class.
  */
-static void switched_to_fair(struct rq *rq, struct task_struct *p,
-			     int running)
+static void
+switched_to_fair(struct rq *rq, struct task_struct *p, int running)
 {
 	/*
 	 * We were most likely switched from sched_rt, so
@@ -4163,6 +4182,7 @@ static const struct sched_class fair_sch
 	.task_fork	= task_fork_fair,
 
 	.prio_changed	= prio_changed_fair,
+	.switched_from	= switched_from_fair,
 	.switched_to	= switched_to_fair,
 
 	.get_rr_interval = get_rr_interval_fair,
* Re: Bug in scheduler when using rt_mutex
From: Onkalo Samu @ 2011-01-19 12:58 UTC
To: ext Peter Zijlstra; +Cc: Yong Zhang, mingo, linux-kernel@vger.kernel.org, tglx, Onkalo Samu.P

On Wed, 2011-01-19 at 12:30 +0100, ext Peter Zijlstra wrote:
> On Wed, 2011-01-19 at 11:38 +0100, Peter Zijlstra wrote:
>
> > Hrm, I think the "to" bit is not needed with the "from" thing in place;
> > the enqueue from _setprio will already have added min_vruntime.
>
> Would lead to something like this:

Doesn't work in my case.
* Re: Bug in scheduler when using rt_mutex
From: Onkalo Samu @ 2011-01-19 13:13 UTC
To: ext Peter Zijlstra; +Cc: Yong Zhang, mingo, linux-kernel@vger.kernel.org, tglx

On Wed, 2011-01-19 at 14:58 +0200, Onkalo Samu wrote:
> On Wed, 2011-01-19 at 12:30 +0100, ext Peter Zijlstra wrote:
> >
> > Would lead to something like this:
>
> Doesn't work in my case.

When the task is sleeping, rt_mutex_setprio doesn't call
check_class_changed, since the task is not on the queue at that moment,
I think.
* Re: Bug in scheduler when using rt_mutex 2011-01-19 13:13 ` Onkalo Samu @ 2011-01-19 13:30 ` Peter Zijlstra 2011-01-20 4:18 ` Yong Zhang 2011-01-20 7:07 ` Onkalo Samu 0 siblings, 2 replies; 42+ messages in thread From: Peter Zijlstra @ 2011-01-19 13:30 UTC (permalink / raw) To: samu.p.onkalo Cc: Yong Zhang, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Wed, 2011-01-19 at 15:13 +0200, Onkalo Samu wrote: > > When the task is sleeping rt_mutex_setprio > doesn't call check_class_changed since the task is not > in queue at that moment I think. Ah, indeed, more changes then.. (I think we can also remove the prio_changed_idle thing, since I think we killed that hotplug usage) --- Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -2025,14 +2025,14 @@ inline int task_curr(const struct task_s static inline void check_class_changed(struct rq *rq, struct task_struct *p, const struct sched_class *prev_class, - int oldprio, int running) + int oldprio) { if (prev_class != p->sched_class) { if (prev_class->switched_from) - prev_class->switched_from(rq, p, running); - p->sched_class->switched_to(rq, p, running); - } else - p->sched_class->prio_changed(rq, p, oldprio, running); + prev_class->switched_from(rq, p); + p->sched_class->switched_to(rq, p); + } else if (oldprio != p->prio) + p->sched_class->prio_changed(rq, p, oldprio); } static void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags) @@ -4570,11 +4570,10 @@ void rt_mutex_setprio(struct task_struct if (running) p->sched_class->set_curr_task(rq); - if (on_rq) { + if (on_rq) enqueue_task(rq, p, oldprio < prio ? ENQUEUE_HEAD : 0); - check_class_changed(rq, p, prev_class, oldprio, running); - } + check_class_changed(rq, p, prev_class, oldprio); task_rq_unlock(rq, &flags); } @@ -4902,11 +4901,10 @@ static int __sched_setscheduler(struct t if (running) p->sched_class->set_curr_task(rq); - if (on_rq) { + if (on_rq) activate_task(rq, p, 0); - check_class_changed(rq, p, prev_class, oldprio, running); - } + check_class_changed(rq, p, prev_class, oldprio); __task_rq_unlock(rq); raw_spin_unlock_irqrestore(&p->pi_lock, flags); @@ -8109,6 +8107,8 @@ EXPORT_SYMBOL(__might_sleep); #ifdef CONFIG_MAGIC_SYSRQ static void normalize_task(struct rq *rq, struct task_struct *p) { + struct sched_class *prev_class = p->sched_class; + int old_prio = p->prio; int on_rq; on_rq = p->se.on_rq; @@ -8119,6 +8119,8 @@ static void normalize_task(struct rq *rq activate_task(rq, p, 0); resched_task(rq->curr); } + + check_class_changed(rq, p, prev_class, old_prio); } void normalize_rt_tasks(void) Index: linux-2.6/kernel/sched_fair.c =================================================================== --- linux-2.6.orig/kernel/sched_fair.c +++ linux-2.6/kernel/sched_fair.c @@ -4054,33 +4054,56 @@ static void task_fork_fair(struct task_s * Priority of the task has changed. Check to see if we preempt * the current task. 
*/ -static void prio_changed_fair(struct rq *rq, struct task_struct *p, - int oldprio, int running) +static void +prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio) { + if (!p->se.on_rq) + return; + /* * Reschedule if we are currently running on this runqueue and * our priority decreased, or if we are not currently running on * this runqueue and our priority is higher than the current's */ - if (running) { + if (rq->curr == p) { if (p->prio > oldprio) resched_task(rq->curr); } else check_preempt_curr(rq, p, 0); } +static void switched_from_fair(struct rq *rq, struct task_struct *p) +{ + struct sched_entity *se = &p->se; + struct cfs_rq *cfs_rq = cfs_rq_of(se); + + /* + * Ensure the task's vruntime is normalized, so that when its + * switched back to the fair class the enqueue_entity(.flags=0) will + * do the right thing. + * + * If it was on_rq, then the dequeue_entity(.flags=0) will already + * have normalized the vruntime, if it was !on_rq, then only when + * the task is sleeping will it still have non-normalized vruntime. + */ + if (!se->on_rq && p->state != TASK_RUNNING) + se->vruntime -= cfs_rq->min_vruntime; +} + /* * We switched to the sched_fair class. */ -static void switched_to_fair(struct rq *rq, struct task_struct *p, - int running) +static void switched_to_fair(struct rq *rq, struct task_struct *p) { + if (!p->se.on_rq) + return; + /* * We were most likely switched from sched_rt, so * kick off the schedule if running, otherwise just see * if we can still preempt the current task. */ - if (running) + if (rq->curr == p) resched_task(rq->curr); else check_preempt_curr(rq, p, 0); @@ -4166,6 +4189,7 @@ static const struct sched_class fair_sch .task_fork = task_fork_fair, .prio_changed = prio_changed_fair, + .switched_from = switched_from_fair, .switched_to = switched_to_fair, .get_rr_interval = get_rr_interval_fair, Index: linux-2.6/include/linux/sched.h =================================================================== --- linux-2.6.orig/include/linux/sched.h +++ linux-2.6/include/linux/sched.h @@ -1084,12 +1084,10 @@ struct sched_class { void (*task_tick) (struct rq *rq, struct task_struct *p, int queued); void (*task_fork) (struct task_struct *p); - void (*switched_from) (struct rq *this_rq, struct task_struct *task, - int running); - void (*switched_to) (struct rq *this_rq, struct task_struct *task, - int running); + void (*switched_from) (struct rq *this_rq, struct task_struct *task); + void (*switched_to) (struct rq *this_rq, struct task_struct *task); void (*prio_changed) (struct rq *this_rq, struct task_struct *task, - int oldprio, int running); + int oldprio); unsigned int (*get_rr_interval) (struct rq *rq, struct task_struct *task); Index: linux-2.6/kernel/sched_idletask.c =================================================================== --- linux-2.6.orig/kernel/sched_idletask.c +++ linux-2.6/kernel/sched_idletask.c @@ -52,8 +52,7 @@ static void set_curr_task_idle(struct rq { } -static void switched_to_idle(struct rq *rq, struct task_struct *p, - int running) +static void switched_to_idle(struct rq *rq, struct task_struct *p) { /* Can this actually happen?? 
*/ if (running) @@ -62,8 +61,8 @@ static void switched_to_idle(struct rq * check_preempt_curr(rq, p, 0); } -static void prio_changed_idle(struct rq *rq, struct task_struct *p, - int oldprio, int running) +static void +prio_changed_idle(struct rq *rq, struct task_struct *p, int oldprio) { /* This can happen for hot plug CPUS */ Index: linux-2.6/kernel/sched_rt.c =================================================================== --- linux-2.6.orig/kernel/sched_rt.c +++ linux-2.6/kernel/sched_rt.c @@ -1595,8 +1595,7 @@ static void rq_offline_rt(struct rq *rq) * When switch from the rt queue, we bring ourselves to a position * that we might want to pull RT tasks from other runqueues. */ -static void switched_from_rt(struct rq *rq, struct task_struct *p, - int running) +static void switched_from_rt(struct rq *rq, struct task_struct *p) { /* * If there are other RT tasks then we will reschedule @@ -1605,7 +1604,7 @@ static void switched_from_rt(struct rq * * we may need to handle the pulling of RT tasks * now. */ - if (!rq->rt.rt_nr_running) + if (p->se.on_rq && !rq->rt.rt_nr_running) pull_rt_task(rq); } @@ -1624,8 +1623,7 @@ static inline void init_sched_rt_class(v * with RT tasks. In this case we try to push them off to * other runqueues. */ -static void switched_to_rt(struct rq *rq, struct task_struct *p, - int running) +static void switched_to_rt(struct rq *rq, struct task_struct *p) { int check_resched = 1; @@ -1636,7 +1634,7 @@ static void switched_to_rt(struct rq *rq * If that current running task is also an RT task * then see if we can move to another run queue. */ - if (!running) { + if (p->se.on_rq && rq->curr != p) { #ifdef CONFIG_SMP if (rq->rt.overloaded && push_rt_task(rq) && /* Don't resched if we changed runqueues */ @@ -1652,10 +1650,13 @@ static void switched_to_rt(struct rq *rq * Priority of the task has changed. This may cause * us to initiate a push or pull. */ -static void prio_changed_rt(struct rq *rq, struct task_struct *p, - int oldprio, int running) +static void +prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio) { - if (running) { + if (!p->se.on_rq) + return; + + if (rq->curr == p) { #ifdef CONFIG_SMP /* * If our priority decreases while running, we Index: linux-2.6/kernel/sched_stoptask.c =================================================================== --- linux-2.6.orig/kernel/sched_stoptask.c +++ linux-2.6/kernel/sched_stoptask.c @@ -59,14 +59,13 @@ static void set_curr_task_stop(struct rq { } -static void switched_to_stop(struct rq *rq, struct task_struct *p, - int running) +static void switched_to_stop(struct rq *rq, struct task_struct *p) { BUG(); /* its impossible to change to this class */ } -static void prio_changed_stop(struct rq *rq, struct task_struct *p, - int oldprio, int running) +static void +prio_changed_stop(struct rq *rq, struct task_struct *p, int oldprio) { BUG(); /* how!?, what priority? */ } ^ permalink raw reply [flat|nested] 42+ messages in thread
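For illustration, a minimal user-space model of the invariant the patch above establishes. This is a toy sketch, not kernel code: toy_dequeue(), toy_enqueue() and toy_switched_from_fair() are invented stand-ins, all numbers are made up, and cfs_rq->min_vruntime is collapsed to a single global. The point is the symmetry: vruntime is stored relative to min_vruntime whenever the task is outside the fair runqueue, and the new switched_from_fair() covers the one case (sleeping at class exit) where neither dequeue nor enqueue would normalize it.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t min_vruntime; /* stand-in for cfs_rq->min_vruntime */

struct toy_se {
	uint64_t vruntime;
	bool on_rq;
	bool sleeping;
};

/* dequeue_entity(): only a non-sleep dequeue normalizes vruntime */
static void toy_dequeue(struct toy_se *se, bool sleep)
{
	se->on_rq = false;
	se->sleeping = sleep;
	if (!sleep)
		se->vruntime -= min_vruntime;
}

/* enqueue_entity(.flags=0): turn the stored offset back into an absolute value */
static void toy_enqueue(struct toy_se *se)
{
	se->vruntime += min_vruntime;
	se->on_rq = true;
	se->sleeping = false;
}

/* the new switched_from_fair(): a task leaving the class while asleep
 * still holds an absolute vruntime, so normalize it here as well */
static void toy_switched_from_fair(struct toy_se *se)
{
	if (!se->on_rq && se->sleeping)
		se->vruntime -= min_vruntime;
}

int main(void)
{
	struct toy_se se = { .vruntime = 1000, .on_rq = true };

	min_vruntime = 900;
	toy_dequeue(&se, true);      /* sleeps while holding the rt_mutex */
	toy_switched_from_fair(&se); /* PI-boosted to RT while asleep */
	min_vruntime = 5000;         /* the fair runqueue moves on */
	toy_enqueue(&se);            /* de-boosted back to fair, flags=0 */

	/* prints 5100; without toy_switched_from_fair() it would be 6000,
	 * i.e. parked far to the right of min_vruntime and skipped */
	printf("vruntime=%llu min_vruntime=%llu\n",
	       (unsigned long long)se.vruntime,
	       (unsigned long long)min_vruntime);
	return 0;
}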
* Re: Bug in scheduler when using rt_mutex 2011-01-19 13:30 ` Peter Zijlstra @ 2011-01-20 4:18 ` Yong Zhang 2011-01-20 4:27 ` Yong Zhang 2011-01-20 4:59 ` Mike Galbraith 2011-01-20 7:07 ` Onkalo Samu 1 sibling, 2 replies; 42+ messages in thread From: Yong Zhang @ 2011-01-20 4:18 UTC (permalink / raw) To: Peter Zijlstra Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Wed, Jan 19, 2011 at 9:30 PM, Peter Zijlstra <peterz@infradead.org> wrote: > +static void switched_from_fair(struct rq *rq, struct task_struct *p) > +{ > + struct sched_entity *se = &p->se; > + struct cfs_rq *cfs_rq = cfs_rq_of(se); > + > + /* > + * Ensure the task's vruntime is normalized, so that when its > + * switched back to the fair class the enqueue_entity(.flags=0) will > + * do the right thing. > + * > + * If it was on_rq, then the dequeue_entity(.flags=0) will already > + * have normalized the vruntime, if it was !on_rq, then only when > + * the task is sleeping will it still have non-normalized vruntime. > + */ > + if (!se->on_rq && p->state != TASK_RUNNING) > + se->vruntime -= cfs_rq->min_vruntime; Here it's possible that se->vruntime is less than cfs_rq->min_vruntime. Thanks, Yong -- Only stand for myself ^ permalink raw reply [flat|nested] 42+ messages in thread
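To put numbers on that point, a small stand-alone sketch with invented values: in the kernel these are u64, so the "negative" offset is stored wrapped and only becomes visible again on the re-add, and nothing on this path ever clips it.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t min_vruntime = 1000000;
	uint64_t se_vruntime  = 400000; /* a long sleeper, far behind */

	/* switched_from_fair(): the offset is logically negative and
	 * wraps in u64; two's-complement arithmetic preserves it */
	uint64_t offset = se_vruntime - min_vruntime;
	printf("offset = %lld\n", (long long)(int64_t)offset);

	/* class re-entry after min_vruntime has advanced further */
	min_vruntime += 2000000;
	uint64_t requeued = offset + min_vruntime;
	printf("requeued at %llu, %lld relative to min_vruntime\n",
	       (unsigned long long)requeued,
	       (long long)(int64_t)(requeued - min_vruntime));
	return 0;
}

This prints an offset of -600000, and the task is re-enqueued the same 600000 ahead of (below) min_vruntime, however long it slept.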
* Re: Bug in scheduler when using rt_mutex 2011-01-20 4:18 ` Yong Zhang @ 2011-01-20 4:27 ` Yong Zhang 2011-01-20 5:32 ` Yong Zhang 2011-01-20 4:59 ` Mike Galbraith 1 sibling, 1 reply; 42+ messages in thread From: Yong Zhang @ 2011-01-20 4:27 UTC (permalink / raw) To: Peter Zijlstra Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, Jan 20, 2011 at 12:18 PM, Yong Zhang <yong.zhang0@gmail.com> wrote: > On Wed, Jan 19, 2011 at 9:30 PM, Peter Zijlstra <peterz@infradead.org> wrote: >> +static void switched_from_fair(struct rq *rq, struct task_struct *p) >> +{ >> + struct sched_entity *se = &p->se; >> + struct cfs_rq *cfs_rq = cfs_rq_of(se); >> + >> + /* >> + * Ensure the task's vruntime is normalized, so that when its >> + * switched back to the fair class the enqueue_entity(.flags=0) will >> + * do the right thing. >> + * >> + * If it was on_rq, then the dequeue_entity(.flags=0) will already >> + * have normalized the vruntime, if it was !on_rq, then only when >> + * the task is sleeping will it still have non-normalized vruntime. >> + */ >> + if (!se->on_rq && p->state != TASK_RUNNING) >> + se->vruntime -= cfs_rq->min_vruntime; > > Here it's possible that se->vruntime is little than cfs_rq->min_vruntime. Maybe we can: place_entity(cfs_rq, se, 0); se->vruntime -= cfs_rq->min_vruntime; ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 4:27 ` Yong Zhang @ 2011-01-20 5:32 ` Yong Zhang 0 siblings, 0 replies; 42+ messages in thread From: Yong Zhang @ 2011-01-20 5:32 UTC (permalink / raw) To: Peter Zijlstra Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, Jan 20, 2011 at 12:27 PM, Yong Zhang <yong.zhang0@gmail.com> wrote: > on Thu, Jan 20, 2011 at 12:18 PM, Yong Zhang <yong.zhang0@gmail.com> wrote: >> On Wed, Jan 19, 2011 at 9:30 PM, Peter Zijlstra <peterz@infradead.org> wrote: >>> +static void switched_from_fair(struct rq *rq, struct task_struct *p) >>> +{ >>> + struct sched_entity *se = &p->se; >>> + struct cfs_rq *cfs_rq = cfs_rq_of(se); >>> + >>> + /* >>> + * Ensure the task's vruntime is normalized, so that when its >>> + * switched back to the fair class the enqueue_entity(.flags=0) will >>> + * do the right thing. >>> + * >>> + * If it was on_rq, then the dequeue_entity(.flags=0) will already >>> + * have normalized the vruntime, if it was !on_rq, then only when >>> + * the task is sleeping will it still have non-normalized vruntime. >>> + */ >>> + if (!se->on_rq && p->state != TASK_RUNNING) >>> + se->vruntime -= cfs_rq->min_vruntime; >> >> Here it's possible that se->vruntime is little than cfs_rq->min_vruntime. > > Maybe we can: > place_entity(cfs_rq, se, 0); > se->vruntime -= cfs_rq->min_vruntime; Hmmm, this doesn't make sense since we don't know how long the task will sleep after we do this. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 4:18 ` Yong Zhang 2011-01-20 4:27 ` Yong Zhang @ 2011-01-20 4:59 ` Mike Galbraith 2011-01-20 5:30 ` Yong Zhang 1 sibling, 1 reply; 42+ messages in thread From: Mike Galbraith @ 2011-01-20 4:59 UTC (permalink / raw) To: Yong Zhang Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, 2011-01-20 at 12:18 +0800, Yong Zhang wrote: > On Wed, Jan 19, 2011 at 9:30 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > +static void switched_from_fair(struct rq *rq, struct task_struct *p) > > +{ > > + struct sched_entity *se = &p->se; > > + struct cfs_rq *cfs_rq = cfs_rq_of(se); > > + > > + /* > > + * Ensure the task's vruntime is normalized, so that when its > > + * switched back to the fair class the enqueue_entity(.flags=0) will > > + * do the right thing. > > + * > > + * If it was on_rq, then the dequeue_entity(.flags=0) will already > > + * have normalized the vruntime, if it was !on_rq, then only when > > + * the task is sleeping will it still have non-normalized vruntime. > > + */ > > + if (!se->on_rq && p->state != TASK_RUNNING) > > + se->vruntime -= cfs_rq->min_vruntime; > > Here it's possible that se->vruntime is little than cfs_rq->min_vruntime. The idea of normalizing is to preserve the offset, so the task can be enqueued at this same offset, regardless of movement. For sleepers, we can't usually do this until enqueue time, lest their sleep time be unaccountable. Here, we have to normalize, so the task's sleep stops from the vantage point of fair_class upon class exit, and either resumes where it left off upon re-entry, or the task is enqueued at its last offset as usual for runnable tasks. -Mike ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 4:59 ` Mike Galbraith @ 2011-01-20 5:30 ` Yong Zhang 2011-01-20 6:12 ` Mike Galbraith 0 siblings, 1 reply; 42+ messages in thread From: Yong Zhang @ 2011-01-20 5:30 UTC (permalink / raw) To: Mike Galbraith Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, Jan 20, 2011 at 12:59 PM, Mike Galbraith <efault@gmx.de> wrote: > or the task is enqueued > at it's last offset as usual for runnable tasks. But shouldn't we treat the task as WAKEUP, even though it has been woken in another sched_class? IOW, I wonder if we should play with place_entity() at some point. Thanks, Yong -- Only stand for myself ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 5:30 ` Yong Zhang @ 2011-01-20 6:12 ` Mike Galbraith 2011-01-20 7:06 ` Yong Zhang 0 siblings, 1 reply; 42+ messages in thread From: Mike Galbraith @ 2011-01-20 6:12 UTC (permalink / raw) To: Yong Zhang Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, 2011-01-20 at 13:30 +0800, Yong Zhang wrote: > On Thu, Jan 20, 2011 at 12:59 PM, Mike Galbraith <efault@gmx.de> wrote: > > or the task is enqueued > > at it's last offset as usual for runnable tasks. > > But shouldn't we task the tasks as WAKEUP, through the task has been > waked on other sched_class? We don't need to play any games with it afaiks, just normalize it, and the rest is taken care of automatically. > IOW, I wonder if we should play with place_entity() at some point. If the task returns as a sleeper, place_entity() will be called when it is awakened, so its sleep credit will be clipped as usual. So vruntime can be much less than min_vruntime at class exit time, and it doesn't matter, clipping on wakeup after re-entry takes care of it.. if that's what you were thinking about. -Mike ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 6:12 ` Mike Galbraith @ 2011-01-20 7:06 ` Yong Zhang 2011-01-20 8:37 ` Mike Galbraith 0 siblings, 1 reply; 42+ messages in thread From: Yong Zhang @ 2011-01-20 7:06 UTC (permalink / raw) To: Mike Galbraith Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, Jan 20, 2011 at 2:12 PM, Mike Galbraith <efault@gmx.de> wrote: > If the task returns as a sleeper, place entity() will be called when it > is awakened, so it's sleep credit will be clipped as usual. So vruntime > can be much less than min_vruntime at class exit time, and it doesn't > matter, clipping on wakeup after re-entry takes care of it.. if that's > what you were thinking about. For a sleeping task which stays in sched_fair until it's woken: try_to_wake_up() ttwu_activate() activate_task() enqueue_task_fair() enqueue_entity() place_entity() <== clip vruntime For a sleeping task which is promoted to sched_rt while it sleeps: rt_mutex_setprio() check_class_changed() switched_from_fair() <== vruntime -= min_vruntime try_to_wake_up() ...run then stay on rq rt_mutex_setprio() enqueue_task_fair() <== vruntime += min_vruntime The difference is that in the second case, place_entity() is not called, but as far as sched_fair is concerned, the task is a WAKEUP task. So we place this task in sched_fair further left than it should be. Thanks, Yong -- Only stand for myself ^ permalink raw reply [flat|nested] 42+ messages in thread
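The asymmetry between the two chains is easy to model. In the sketch below, toy_place_entity() is an invented stand-in for place_entity()'s sleeper clipping (the real threshold is derived from sched_latency and the GENTLE_FAIR_SLEEPERS feature), and every number is made up:

#include <stdint.h>
#include <stdio.h>

/* crude model of place_entity(.initial=0): a waking sleeper is placed
 * at most 'thresh' below min_vruntime */
static uint64_t toy_place_entity(uint64_t vruntime, uint64_t min_vruntime,
				 uint64_t thresh)
{
	uint64_t floor = min_vruntime - thresh;

	if ((int64_t)(vruntime - floor) < 0) /* vruntime < floor */
		vruntime = floor;
	return vruntime;
}

int main(void)
{
	uint64_t se_vruntime    = 1000000; /* ~min_vruntime when it slept */
	uint64_t min_at_boost   = 5000000; /* min_vruntime at PI boost */
	uint64_t min_at_requeue = 9000000; /* ...at re-entry to fair */
	uint64_t thresh         = 12000;

	/* path 1: stays in sched_fair, place_entity() clips on wakeup */
	printf("fair wakeup:  %llu\n", (unsigned long long)
	       toy_place_entity(se_vruntime, min_at_requeue, thresh));

	/* path 2: boosted while asleep; the raw offset is stored at class
	 * exit and re-added at re-entry, place_entity() never runs */
	uint64_t offset = se_vruntime - min_at_boost;
	printf("via sched_rt: %llu\n",
	       (unsigned long long)(offset + min_at_requeue));
	return 0;
}

The fair wakeup comes back just under min_vruntime (8988000 here); the detour through sched_rt comes back at 5000000, a full 4000000 of banked sleep credit.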
* Re: Bug in scheduler when using rt_mutex 2011-01-20 7:06 ` Yong Zhang @ 2011-01-20 8:37 ` Mike Galbraith 2011-01-20 9:07 ` Yong Zhang 0 siblings, 1 reply; 42+ messages in thread From: Mike Galbraith @ 2011-01-20 8:37 UTC (permalink / raw) To: Yong Zhang Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, 2011-01-20 at 15:06 +0800, Yong Zhang wrote: > On Thu, Jan 20, 2011 at 2:12 PM, Mike Galbraith <efault@gmx.de> wrote: > > If the task returns as a sleeper, place entity() will be called when it > > is awakened, so it's sleep credit will be clipped as usual. So vruntime > > can be much less than min_vruntime at class exit time, and it doesn't > > matter, clipping on wakeup after re-entry takes care of it.. if that's > > what you were thinking about. > > For a sleep task which stay in sched_fair before it's waked: > try_to_wake_up() > ttwu_activate() > activate_task() > enqueue_task_fair() > enqueue_entity() > place_entity() <== clip vruntime > > For a sleep task which promote to sched_rt when it's sleep: > rt_mutex_setprio() > check_class_changed() > switch_from_fair() <== vruntime -= min_vruntime > try_to_wake_up() > ...run then stay on rq > rt_mutex_setprio() > enqueue_task_fair() <==vruntime += min_vruntime > > The difference is that in the second case, place_entity() is not > called, but wrt sched_fair, the task is a WAKEUP task. > Then we place this task in sched_fair before where it should be. D'oh. You're right, he needs to be clipped before he leaves. -Mike ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 8:37 ` Mike Galbraith @ 2011-01-20 9:07 ` Yong Zhang 2011-01-20 10:07 ` Mike Galbraith 0 siblings, 1 reply; 42+ messages in thread From: Yong Zhang @ 2011-01-20 9:07 UTC (permalink / raw) To: Mike Galbraith Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, Jan 20, 2011 at 4:37 PM, Mike Galbraith <efault@gmx.de> wrote: > On Thu, 2011-01-20 at 15:06 +0800, Yong Zhang wrote: >> On Thu, Jan 20, 2011 at 2:12 PM, Mike Galbraith <efault@gmx.de> wrote: >> > If the task returns as a sleeper, place entity() will be called when it >> > is awakened, so it's sleep credit will be clipped as usual. So vruntime >> > can be much less than min_vruntime at class exit time, and it doesn't >> > matter, clipping on wakeup after re-entry takes care of it.. if that's >> > what you were thinking about. >> >> For a sleep task which stay in sched_fair before it's waked: >> try_to_wake_up() >> ttwu_activate() >> activate_task() >> enqueue_task_fair() >> enqueue_entity() >> place_entity() <== clip vruntime >> >> For a sleep task which promote to sched_rt when it's sleep: >> rt_mutex_setprio() >> check_class_changed() >> switch_from_fair() <== vruntime -= min_vruntime >> try_to_wake_up() >> ...run then stay on rq >> rt_mutex_setprio() >> enqueue_task_fair() <==vruntime += min_vruntime >> >> The difference is that in the second case, place_entity() is not >> called, but wrt sched_fair, the task is a WAKEUP task. >> Then we place this task in sched_fair before where it should be. > > D'oh. You're right, he needs to be clipped before he leaves. Exactly, we should clip it when it comes back, because it could still sleep for some time after it leaves ;) Thanks, Yong -- Only stand for myself ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 9:07 ` Yong Zhang @ 2011-01-20 10:07 ` Mike Galbraith 2011-01-21 11:08 ` Peter Zijlstra 2011-01-21 13:15 ` Yong Zhang 0 siblings, 2 replies; 42+ messages in thread From: Mike Galbraith @ 2011-01-20 10:07 UTC (permalink / raw) To: Yong Zhang Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, 2011-01-20 at 17:07 +0800, Yong Zhang wrote: > On Thu, Jan 20, 2011 at 4:37 PM, Mike Galbraith <efault@gmx.de> wrote: > > On Thu, 2011-01-20 at 15:06 +0800, Yong Zhang wrote: > >> On Thu, Jan 20, 2011 at 2:12 PM, Mike Galbraith <efault@gmx.de> wrote: > >> > If the task returns as a sleeper, place entity() will be called when it > >> > is awakened, so it's sleep credit will be clipped as usual. So vruntime > >> > can be much less than min_vruntime at class exit time, and it doesn't > >> > matter, clipping on wakeup after re-entry takes care of it.. if that's > >> > what you were thinking about. > >> > >> For a sleep task which stay in sched_fair before it's waked: > >> try_to_wake_up() > >> ttwu_activate() > >> activate_task() > >> enqueue_task_fair() > >> enqueue_entity() > >> place_entity() <== clip vruntime > >> > >> For a sleep task which promote to sched_rt when it's sleep: > >> rt_mutex_setprio() > >> check_class_changed() > >> switch_from_fair() <== vruntime -= min_vruntime > >> try_to_wake_up() > >> ...run then stay on rq > >> rt_mutex_setprio() > >> enqueue_task_fair() <==vruntime += min_vruntime > >> > >> The difference is that in the second case, place_entity() is not > >> called, but wrt sched_fair, the task is a WAKEUP task. > >> Then we place this task in sched_fair before where it should be. > > > > D'oh. You're right, he needs to be clipped before he leaves. > > Exactly we should clip it when it comes back, because it still could > sleep for some time after it leaves ;) That's ok, we don't and aren't supposed to care what happens while he's gone. But we do have to make sure that vruntime is sane either when he leaves, or when he comes back. Seems to me the easiest is clip when he leaves to cover him having slept a long time before leaving, then coming back on us as a runner. If he comes back as a sleeper, he'll be clipped again anyway, so all is well. sched_fork() should probably zero child's vruntime too, so non-fair children can't enter fair_class with some bogus lag they never had. -Mike ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 10:07 ` Mike Galbraith @ 2011-01-21 11:08 ` Peter Zijlstra 2011-01-21 12:24 ` Yong Zhang 2011-01-21 13:15 ` Yong Zhang 1 sibling, 1 reply; 42+ messages in thread From: Peter Zijlstra @ 2011-01-21 11:08 UTC (permalink / raw) To: Mike Galbraith Cc: Yong Zhang, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, 2011-01-20 at 11:07 +0100, Mike Galbraith wrote: > On Thu, 2011-01-20 at 17:07 +0800, Yong Zhang wrote: > > On Thu, Jan 20, 2011 at 4:37 PM, Mike Galbraith <efault@gmx.de> wrote: > > > On Thu, 2011-01-20 at 15:06 +0800, Yong Zhang wrote: > > >> On Thu, Jan 20, 2011 at 2:12 PM, Mike Galbraith <efault@gmx.de> wrote: > > >> > If the task returns as a sleeper, place entity() will be called when it > > >> > is awakened, so it's sleep credit will be clipped as usual. So vruntime > > >> > can be much less than min_vruntime at class exit time, and it doesn't > > >> > matter, clipping on wakeup after re-entry takes care of it.. if that's > > >> > what you were thinking about. > > >> > > >> For a sleep task which stay in sched_fair before it's waked: > > >> try_to_wake_up() > > >> ttwu_activate() > > >> activate_task() > > >> enqueue_task_fair() > > >> enqueue_entity() > > >> place_entity() <== clip vruntime > > >> > > >> For a sleep task which promote to sched_rt when it's sleep: > > >> rt_mutex_setprio() > > >> check_class_changed() > > >> switch_from_fair() <== vruntime -= min_vruntime > > >> try_to_wake_up() > > >> ...run then stay on rq > > >> rt_mutex_setprio() > > >> enqueue_task_fair() <==vruntime += min_vruntime > > >> > > >> The difference is that in the second case, place_entity() is not > > >> called, but wrt sched_fair, the task is a WAKEUP task. > > >> Then we place this task in sched_fair before where it should be. > > > > > > D'oh. You're right, he needs to be clipped before he leaves. > > > > Exactly we should clip it when it comes back, because it still could > > sleep for some time after it leaves ;) > > That's ok, we don't and aren't supposed to care what happens while he's > gone. But we do have to make sure that vruntime is sane either when he > leaves, or when he comes back. Seems to me the easiest is clip when he > leaves to cover him having slept a long time before leaving, then coming > back on us as a runner. If he comes back as a sleeper, he'll be clipped > again anyway, so all is well. > > sched_fork() should probably zero child's vruntime too, so non-fair > children can't enter fair_class with some bogus lag they never had. Something like so? Index: linux-2.6/kernel/sched.c =================================================================== --- linux-2.6.orig/kernel/sched.c +++ linux-2.6/kernel/sched.c @@ -2624,6 +2624,8 @@ void sched_fork(struct task_struct *p, i if (!rt_prio(p->prio)) p->sched_class = &fair_sched_class; + else + p->se.vruntime = 0; if (p->sched_class->task_fork) p->sched_class->task_fork(p); Index: linux-2.6/kernel/sched_fair.c =================================================================== --- linux-2.6.orig/kernel/sched_fair.c +++ linux-2.6/kernel/sched_fair.c @@ -4086,8 +4086,14 @@ static void switched_from_fair(struct rq * have normalized the vruntime, if it was !on_rq, then only when * the task is sleeping will it still have non-normalized vruntime. */ - if (!se->on_rq && p->state != TASK_RUNNING) + if (!se->on_rq && p->state != TASK_RUNNING) { + /* + * Fix up our vruntime so that the current sleep doesn't + * cause 'unlimited' sleep bonus. 
+ */ + place_entity(cfs_rq, se, 0); se->vruntime -= cfs_rq->min_vruntime; + } } /* ^ permalink raw reply [flat|nested] 42+ messages in thread
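Plugged into the same toy model used earlier in the thread (same invented numbers and the same stand-in for place_entity()), the new hunk bounds what can survive the round trip through sched_rt:

#include <stdint.h>
#include <stdio.h>

/* same crude model of place_entity(.initial=0) as in the earlier sketch */
static uint64_t toy_place_entity(uint64_t vruntime, uint64_t min_vruntime,
				 uint64_t thresh)
{
	uint64_t floor = min_vruntime - thresh;

	if ((int64_t)(vruntime - floor) < 0)
		vruntime = floor;
	return vruntime;
}

int main(void)
{
	uint64_t se_vruntime    = 1000000;
	uint64_t min_at_boost   = 5000000;
	uint64_t min_at_requeue = 9000000;
	uint64_t thresh         = 12000;

	/* switched_from_fair() with the fix: clip, then normalize, so the
	 * stored offset can never exceed the ordinary sleeper credit */
	uint64_t offset = toy_place_entity(se_vruntime, min_at_boost, thresh)
			  - min_at_boost;

	/* re-entry: 8988000, the same as a plain fair wakeup would get */
	printf("requeued at %llu (min_vruntime %llu)\n",
	       (unsigned long long)(offset + min_at_requeue),
	       (unsigned long long)min_at_requeue);
	return 0;
}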
* Re: Bug in scheduler when using rt_mutex 2011-01-21 11:08 ` Peter Zijlstra @ 2011-01-21 12:24 ` Yong Zhang 2011-01-21 13:40 ` Peter Zijlstra 0 siblings, 1 reply; 42+ messages in thread From: Yong Zhang @ 2011-01-21 12:24 UTC (permalink / raw) To: Peter Zijlstra Cc: Mike Galbraith, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Fri, Jan 21, 2011 at 12:08:56PM +0100, Peter Zijlstra wrote: > > That's ok, we don't and aren't supposed to care what happens while he's > > gone. But we do have to make sure that vruntime is sane either when he > > leaves, or when he comes back. Seems to me the easiest is clip when he > > leaves to cover him having slept a long time before leaving, then coming > > back on us as a runner. If he comes back as a sleeper, he'll be clipped > > again anyway, so all is well. > > > > sched_fork() should probably zero child's vruntime too, so non-fair > > children can't enter fair_class with some bogus lag they never had. > > Something like so? > > Index: linux-2.6/kernel/sched.c > =================================================================== > --- linux-2.6.orig/kernel/sched.c > +++ linux-2.6/kernel/sched.c > @@ -2624,6 +2624,8 @@ void sched_fork(struct task_struct *p, i > > if (!rt_prio(p->prio)) > p->sched_class = &fair_sched_class; > + else > + p->se.vruntime = 0; This can be moved to __sched_fork() > > if (p->sched_class->task_fork) > p->sched_class->task_fork(p); > Index: linux-2.6/kernel/sched_fair.c > =================================================================== > --- linux-2.6.orig/kernel/sched_fair.c > +++ linux-2.6/kernel/sched_fair.c > @@ -4086,8 +4086,14 @@ static void switched_from_fair(struct rq > * have normalized the vruntime, if it was !on_rq, then only when > * the task is sleeping will it still have non-normalized vruntime. > */ > - if (!se->on_rq && p->state != TASK_RUNNING) > + if (!se->on_rq && p->state != TASK_RUNNING) { > + /* > + * Fix up our vruntime so that the current sleep doesn't > + * cause 'unlimited' sleep bonus. > + */ > + place_entity(cfs_rq, se, 0); > se->vruntime -= cfs_rq->min_vruntime; Now I will say yes. Though it's the same as my suggestion, which I rejected myself :) Thanks, Yong ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-21 12:24 ` Yong Zhang @ 2011-01-21 13:40 ` Peter Zijlstra 2011-01-21 15:03 ` Yong Zhang 0 siblings, 1 reply; 42+ messages in thread From: Peter Zijlstra @ 2011-01-21 13:40 UTC (permalink / raw) To: Yong Zhang Cc: Mike Galbraith, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Fri, 2011-01-21 at 20:24 +0800, Yong Zhang wrote: > > Index: linux-2.6/kernel/sched.c > > =================================================================== > > --- linux-2.6.orig/kernel/sched.c > > +++ linux-2.6/kernel/sched.c > > @@ -2624,6 +2624,8 @@ void sched_fork(struct task_struct *p, i > > > > if (!rt_prio(p->prio)) > > p->sched_class = &fair_sched_class; > > + else > > + p->se.vruntime = 0; > > This can be moved to __sched_fork() But we cannot do it unconditionally; we want to inherit the parent's ->vruntime when we're a fair task clone. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-21 13:40 ` Peter Zijlstra @ 2011-01-21 15:03 ` Yong Zhang 2011-01-21 15:10 ` Peter Zijlstra 0 siblings, 1 reply; 42+ messages in thread From: Yong Zhang @ 2011-01-21 15:03 UTC (permalink / raw) To: Peter Zijlstra Cc: Mike Galbraith, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Fri, Jan 21, 2011 at 02:40:58PM +0100, Peter Zijlstra wrote: > On Fri, 2011-01-21 at 20:24 +0800, Yong Zhang wrote: > > > Index: linux-2.6/kernel/sched.c > > > =================================================================== > > > --- linux-2.6.orig/kernel/sched.c > > > +++ linux-2.6/kernel/sched.c > > > @@ -2624,6 +2624,8 @@ void sched_fork(struct task_struct *p, i > > > > > > if (!rt_prio(p->prio)) > > > p->sched_class = &fair_sched_class; > > > + else > > > + p->se.vruntime = 0; > > > > This can be moved to __sched_fork() > > But we cannot do it unconditionally, we want to inherit the parent's > ->vruntime when we're a fair task clone. Doesn't task_fork_fair() (which is called after __sched_fork()) take care of the inheritance? Am I missing something? Thanks, Yong ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-21 15:03 ` Yong Zhang @ 2011-01-21 15:10 ` Peter Zijlstra 0 siblings, 0 replies; 42+ messages in thread From: Peter Zijlstra @ 2011-01-21 15:10 UTC (permalink / raw) To: Yong Zhang Cc: Mike Galbraith, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Fri, 2011-01-21 at 23:03 +0800, Yong Zhang wrote: > On Fri, Jan 21, 2011 at 02:40:58PM +0100, Peter Zijlstra wrote: > > On Fri, 2011-01-21 at 20:24 +0800, Yong Zhang wrote: > > > > Index: linux-2.6/kernel/sched.c > > > > =================================================================== > > > > --- linux-2.6.orig/kernel/sched.c > > > > +++ linux-2.6/kernel/sched.c > > > > @@ -2624,6 +2624,8 @@ void sched_fork(struct task_struct *p, i > > > > > > > > if (!rt_prio(p->prio)) > > > > p->sched_class = &fair_sched_class; > > > > + else > > > > + p->se.vruntime = 0; > > > > > > This can be moved to __sched_fork() > > > > But we cannot do it unconditionally, we want to inherit the parent's > > ->vruntime when we're a fair task clone. > > Doesn't task_fork_fair()(which is called after __sched_fork()) > take the inherit job? > > Am I missing something? Ah, indeed, we do the place_entity(.initial=1) thing there unconditionally. For some reason I thought we relied on the inherited vruntime. ^ permalink raw reply [flat|nested] 42+ messages in thread
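For completeness, a sketch of the fork-time reasoning just agreed on. toy_fork_fair() is invented, and the statement that sched_rt provides no ->task_fork reflects the code of this era as described in the exchange above:

#include <stdint.h>
#include <stdio.h>

/* models task_fork_fair() -> place_entity(cfs_rq, se, 1): the child is
 * re-seeded from min_vruntime (START_DEBIT penalty omitted), so any
 * vruntime copied from the parent is overwritten */
static uint64_t toy_fork_fair(uint64_t inherited, uint64_t min_vruntime)
{
	(void)inherited; /* discarded unconditionally */
	return min_vruntime;
}

int main(void)
{
	uint64_t min_vruntime = 7000000;
	uint64_t inherited = 123456; /* stale value copied in fork */

	/* fair child: ->task_fork runs, the inherited value is moot */
	printf("fair child: %llu\n",
	       (unsigned long long)toy_fork_fair(inherited, min_vruntime));

	/* RT child: no ->task_fork in sched_rt, so the proposed
	 * sched_fork() hunk zeroes se.vruntime instead */
	printf("rt child:   0\n");
	return 0;
}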
* Re: Bug in scheduler when using rt_mutex 2011-01-20 10:07 ` Mike Galbraith 2011-01-21 11:08 ` Peter Zijlstra @ 2011-01-21 13:15 ` Yong Zhang 1 sibling, 0 replies; 42+ messages in thread From: Yong Zhang @ 2011-01-21 13:15 UTC (permalink / raw) To: Mike Galbraith Cc: Peter Zijlstra, samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, Jan 20, 2011 at 11:07:27AM +0100, Mike Galbraith wrote: > On Thu, 2011-01-20 at 17:07 +0800, Yong Zhang wrote: > > On Thu, Jan 20, 2011 at 4:37 PM, Mike Galbraith <efault@gmx.de> wrote: > > > On Thu, 2011-01-20 at 15:06 +0800, Yong Zhang wrote: > > >> On Thu, Jan 20, 2011 at 2:12 PM, Mike Galbraith <efault@gmx.de> wrote: > > >> > If the task returns as a sleeper, place entity() will be called when it > > >> > is awakened, so it's sleep credit will be clipped as usual. So vruntime > > >> > can be much less than min_vruntime at class exit time, and it doesn't > > >> > matter, clipping on wakeup after re-entry takes care of it.. if that's > > >> > what you were thinking about. > > >> > > >> For a sleep task which stay in sched_fair before it's waked: > > >> try_to_wake_up() > > >> ttwu_activate() > > >> activate_task() > > >> enqueue_task_fair() > > >> enqueue_entity() > > >> place_entity() <== clip vruntime > > >> > > >> For a sleep task which promote to sched_rt when it's sleep: > > >> rt_mutex_setprio() > > >> check_class_changed() > > >> switch_from_fair() <== vruntime -= min_vruntime > > >> try_to_wake_up() > > >> ...run then stay on rq > > >> rt_mutex_setprio() > > >> enqueue_task_fair() <==vruntime += min_vruntime > > >> > > >> The difference is that in the second case, place_entity() is not > > >> called, but wrt sched_fair, the task is a WAKEUP task. > > >> Then we place this task in sched_fair before where it should be. > > > > > > D'oh. You're right, he needs to be clipped before he leaves. > > > > Exactly we should clip it when it comes back, because it still could > > sleep for some time after it leaves ;) > > That's ok, we don't and aren't supposed to care what happens while he's > gone. But we do have to make sure that vruntime is sane either when he > leaves, or when he comes back. Seems to me the easiest is clip when he > leaves to cover him having slept a long time before leaving, then coming > back on us as a runner. If he comes back as a sleeper, he'll be clipped > again anyway, so all is well. OK, fair enough. > > sched_fork() should probably zero child's vruntime too, so non-fair > children can't enter fair_class with some bogus lag they never had. Yeah, so I think Peter's new patch is taking care of all the issues now. Thanks, Yong ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-19 13:30 ` Peter Zijlstra 2011-01-20 4:18 ` Yong Zhang @ 2011-01-20 7:07 ` Onkalo Samu 2011-01-21 6:25 ` Onkalo Samu 1 sibling, 1 reply; 42+ messages in thread From: Onkalo Samu @ 2011-01-20 7:07 UTC (permalink / raw) To: ext Peter Zijlstra Cc: Yong Zhang, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Wed, 2011-01-19 at 14:30 +0100, ext Peter Zijlstra wrote: > On Wed, 2011-01-19 at 15:13 +0200, Onkalo Samu wrote: > > > > When the task is sleeping rt_mutex_setprio > > doesn't call check_class_changed since the task is not > > in queue at that moment I think. > > Ah, indeed, more changes then.. > > (I think we can also remove the prio_changed_idle thing, since I think > we killed that hotplug usage) Doesn't compile because kernel/sched_idletask.c functions still use "running" parameter which was removed. > > > --- > Index: linux-2.6/kernel/sched.c > =================================================================== > --- linux-2.6.orig/kernel/sched.c > +++ linux-2.6/kernel/sched.c > @@ -2025,14 +2025,14 @@ inline int task_curr(const struct task_s > > static inline void check_class_changed(struct rq *rq, struct task_struct *p, > const struct sched_class *prev_class, > - int oldprio, int running) > + int oldprio) > { > if (prev_class != p->sched_class) { > if (prev_class->switched_from) > - prev_class->switched_from(rq, p, running); > - p->sched_class->switched_to(rq, p, running); > - } else > - p->sched_class->prio_changed(rq, p, oldprio, running); > + prev_class->switched_from(rq, p); > + p->sched_class->switched_to(rq, p); > + } else if (oldprio != p->prio) > + p->sched_class->prio_changed(rq, p, oldprio); > } > > static void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags) > @@ -4570,11 +4570,10 @@ void rt_mutex_setprio(struct task_struct > > if (running) > p->sched_class->set_curr_task(rq); > - if (on_rq) { > + if (on_rq) > enqueue_task(rq, p, oldprio < prio ? ENQUEUE_HEAD : 0); > > - check_class_changed(rq, p, prev_class, oldprio, running); > - } > + check_class_changed(rq, p, prev_class, oldprio); > task_rq_unlock(rq, &flags); > } > > @@ -4902,11 +4901,10 @@ static int __sched_setscheduler(struct t > > if (running) > p->sched_class->set_curr_task(rq); > - if (on_rq) { > + if (on_rq) > activate_task(rq, p, 0); > > - check_class_changed(rq, p, prev_class, oldprio, running); > - } > + check_class_changed(rq, p, prev_class, oldprio); > __task_rq_unlock(rq); > raw_spin_unlock_irqrestore(&p->pi_lock, flags); > > @@ -8109,6 +8107,8 @@ EXPORT_SYMBOL(__might_sleep); > #ifdef CONFIG_MAGIC_SYSRQ > static void normalize_task(struct rq *rq, struct task_struct *p) > { > + struct sched_class *prev_class = p->sched_class; > + int old_prio = p->prio; > int on_rq; > > on_rq = p->se.on_rq; > @@ -8119,6 +8119,8 @@ static void normalize_task(struct rq *rq > activate_task(rq, p, 0); > resched_task(rq->curr); > } > + > + check_class_changed(rq, p, prev_class, old_prio); > } > > void normalize_rt_tasks(void) > Index: linux-2.6/kernel/sched_fair.c > =================================================================== > --- linux-2.6.orig/kernel/sched_fair.c > +++ linux-2.6/kernel/sched_fair.c > @@ -4054,33 +4054,56 @@ static void task_fork_fair(struct task_s > * Priority of the task has changed. Check to see if we preempt > * the current task. 
> */ > -static void prio_changed_fair(struct rq *rq, struct task_struct *p, > - int oldprio, int running) > +static void > +prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio) > { > + if (!p->se.on_rq) > + return; > + > /* > * Reschedule if we are currently running on this runqueue and > * our priority decreased, or if we are not currently running on > * this runqueue and our priority is higher than the current's > */ > - if (running) { > + if (rq->curr == p) { > if (p->prio > oldprio) > resched_task(rq->curr); > } else > check_preempt_curr(rq, p, 0); > } > > +static void switched_from_fair(struct rq *rq, struct task_struct *p) > +{ > + struct sched_entity *se = &p->se; > + struct cfs_rq *cfs_rq = cfs_rq_of(se); > + > + /* > + * Ensure the task's vruntime is normalized, so that when its > + * switched back to the fair class the enqueue_entity(.flags=0) will > + * do the right thing. > + * > + * If it was on_rq, then the dequeue_entity(.flags=0) will already > + * have normalized the vruntime, if it was !on_rq, then only when > + * the task is sleeping will it still have non-normalized vruntime. > + */ > + if (!se->on_rq && p->state != TASK_RUNNING) > + se->vruntime -= cfs_rq->min_vruntime; > +} > + > /* > * We switched to the sched_fair class. > */ > -static void switched_to_fair(struct rq *rq, struct task_struct *p, > - int running) > +static void switched_to_fair(struct rq *rq, struct task_struct *p) > { > + if (!p->se.on_rq) > + return; > + > /* > * We were most likely switched from sched_rt, so > * kick off the schedule if running, otherwise just see > * if we can still preempt the current task. > */ > - if (running) > + if (rq->curr == p) > resched_task(rq->curr); > else > check_preempt_curr(rq, p, 0); > @@ -4166,6 +4189,7 @@ static const struct sched_class fair_sch > .task_fork = task_fork_fair, > > .prio_changed = prio_changed_fair, > + .switched_from = switched_from_fair, > .switched_to = switched_to_fair, > > .get_rr_interval = get_rr_interval_fair, > Index: linux-2.6/include/linux/sched.h > =================================================================== > --- linux-2.6.orig/include/linux/sched.h > +++ linux-2.6/include/linux/sched.h > @@ -1084,12 +1084,10 @@ struct sched_class { > void (*task_tick) (struct rq *rq, struct task_struct *p, int queued); > void (*task_fork) (struct task_struct *p); > > - void (*switched_from) (struct rq *this_rq, struct task_struct *task, > - int running); > - void (*switched_to) (struct rq *this_rq, struct task_struct *task, > - int running); > + void (*switched_from) (struct rq *this_rq, struct task_struct *task); > + void (*switched_to) (struct rq *this_rq, struct task_struct *task); > void (*prio_changed) (struct rq *this_rq, struct task_struct *task, > - int oldprio, int running); > + int oldprio); > > unsigned int (*get_rr_interval) (struct rq *rq, > struct task_struct *task); > Index: linux-2.6/kernel/sched_idletask.c > =================================================================== > --- linux-2.6.orig/kernel/sched_idletask.c > +++ linux-2.6/kernel/sched_idletask.c > @@ -52,8 +52,7 @@ static void set_curr_task_idle(struct rq > { > } > > -static void switched_to_idle(struct rq *rq, struct task_struct *p, > - int running) > +static void switched_to_idle(struct rq *rq, struct task_struct *p) > { > /* Can this actually happen?? 
*/ > if (running) > @@ -62,8 +61,8 @@ static void switched_to_idle(struct rq * > check_preempt_curr(rq, p, 0); > } > > -static void prio_changed_idle(struct rq *rq, struct task_struct *p, > - int oldprio, int running) > +static void > +prio_changed_idle(struct rq *rq, struct task_struct *p, int oldprio) > { > /* This can happen for hot plug CPUS */ > > Index: linux-2.6/kernel/sched_rt.c > =================================================================== > --- linux-2.6.orig/kernel/sched_rt.c > +++ linux-2.6/kernel/sched_rt.c > @@ -1595,8 +1595,7 @@ static void rq_offline_rt(struct rq *rq) > * When switch from the rt queue, we bring ourselves to a position > * that we might want to pull RT tasks from other runqueues. > */ > -static void switched_from_rt(struct rq *rq, struct task_struct *p, > - int running) > +static void switched_from_rt(struct rq *rq, struct task_struct *p) > { > /* > * If there are other RT tasks then we will reschedule > @@ -1605,7 +1604,7 @@ static void switched_from_rt(struct rq * > * we may need to handle the pulling of RT tasks > * now. > */ > - if (!rq->rt.rt_nr_running) > + if (p->se.on_rq && !rq->rt.rt_nr_running) > pull_rt_task(rq); > } > > @@ -1624,8 +1623,7 @@ static inline void init_sched_rt_class(v > * with RT tasks. In this case we try to push them off to > * other runqueues. > */ > -static void switched_to_rt(struct rq *rq, struct task_struct *p, > - int running) > +static void switched_to_rt(struct rq *rq, struct task_struct *p) > { > int check_resched = 1; > > @@ -1636,7 +1634,7 @@ static void switched_to_rt(struct rq *rq > * If that current running task is also an RT task > * then see if we can move to another run queue. > */ > - if (!running) { > + if (p->se.on_rq && rq->curr != p) { > #ifdef CONFIG_SMP > if (rq->rt.overloaded && push_rt_task(rq) && > /* Don't resched if we changed runqueues */ > @@ -1652,10 +1650,13 @@ static void switched_to_rt(struct rq *rq > * Priority of the task has changed. This may cause > * us to initiate a push or pull. > */ > -static void prio_changed_rt(struct rq *rq, struct task_struct *p, > - int oldprio, int running) > +static void > +prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio) > { > - if (running) { > + if (!p->se.on_rq) > + return; > + > + if (rq->curr == p) { > #ifdef CONFIG_SMP > /* > * If our priority decreases while running, we > Index: linux-2.6/kernel/sched_stoptask.c > =================================================================== > --- linux-2.6.orig/kernel/sched_stoptask.c > +++ linux-2.6/kernel/sched_stoptask.c > @@ -59,14 +59,13 @@ static void set_curr_task_stop(struct rq > { > } > > -static void switched_to_stop(struct rq *rq, struct task_struct *p, > - int running) > +static void switched_to_stop(struct rq *rq, struct task_struct *p) > { > BUG(); /* its impossible to change to this class */ > } > > -static void prio_changed_stop(struct rq *rq, struct task_struct *p, > - int oldprio, int running) > +static void > +prio_changed_stop(struct rq *rq, struct task_struct *p, int oldprio) > { > BUG(); /* how!?, what priority? */ > } > ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-20 7:07 ` Onkalo Samu @ 2011-01-21 6:25 ` Onkalo Samu 0 siblings, 0 replies; 42+ messages in thread From: Onkalo Samu @ 2011-01-21 6:25 UTC (permalink / raw) To: ext Peter Zijlstra Cc: Yong Zhang, mingo, linux-kernel@vger.kernel.org, tglx, Steven Rostedt On Thu, 2011-01-20 at 09:07 +0200, Onkalo Samu wrote: > On Wed, 2011-01-19 at 14:30 +0100, ext Peter Zijlstra wrote: > > On Wed, 2011-01-19 at 15:13 +0200, Onkalo Samu wrote: > > > > > > When the task is sleeping rt_mutex_setprio > > > doesn't call check_class_changed since the task is not > > > in queue at that moment I think. > > > > Ah, indeed, more changes then.. > > > > (I think we can also remove the prio_changed_idle thing, since I think > > we killed that hotplug usage) > > Doesn't compile because kernel/sched_idletask.c > functions still use "running" parameter which was removed. > I made similar corrections to sched_idletask.c as you did for other functions. Seems to work for us. -Samu > > > > > > > --- > > Index: linux-2.6/kernel/sched.c > > =================================================================== > > --- linux-2.6.orig/kernel/sched.c > > +++ linux-2.6/kernel/sched.c > > @@ -2025,14 +2025,14 @@ inline int task_curr(const struct task_s > > > > static inline void check_class_changed(struct rq *rq, struct task_struct *p, > > const struct sched_class *prev_class, > > - int oldprio, int running) > > + int oldprio) > > { > > if (prev_class != p->sched_class) { > > if (prev_class->switched_from) > > - prev_class->switched_from(rq, p, running); > > - p->sched_class->switched_to(rq, p, running); > > - } else > > - p->sched_class->prio_changed(rq, p, oldprio, running); > > + prev_class->switched_from(rq, p); > > + p->sched_class->switched_to(rq, p); > > + } else if (oldprio != p->prio) > > + p->sched_class->prio_changed(rq, p, oldprio); > > } > > > > static void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags) > > @@ -4570,11 +4570,10 @@ void rt_mutex_setprio(struct task_struct > > > > if (running) > > p->sched_class->set_curr_task(rq); > > - if (on_rq) { > > + if (on_rq) > > enqueue_task(rq, p, oldprio < prio ? 
ENQUEUE_HEAD : 0); > > > > - check_class_changed(rq, p, prev_class, oldprio, running); > > - } > > + check_class_changed(rq, p, prev_class, oldprio); > > task_rq_unlock(rq, &flags); > > } > > > > @@ -4902,11 +4901,10 @@ static int __sched_setscheduler(struct t > > > > if (running) > > p->sched_class->set_curr_task(rq); > > - if (on_rq) { > > + if (on_rq) > > activate_task(rq, p, 0); > > > > - check_class_changed(rq, p, prev_class, oldprio, running); > > - } > > + check_class_changed(rq, p, prev_class, oldprio); > > __task_rq_unlock(rq); > > raw_spin_unlock_irqrestore(&p->pi_lock, flags); > > > > @@ -8109,6 +8107,8 @@ EXPORT_SYMBOL(__might_sleep); > > #ifdef CONFIG_MAGIC_SYSRQ > > static void normalize_task(struct rq *rq, struct task_struct *p) > > { > > + struct sched_class *prev_class = p->sched_class; > > + int old_prio = p->prio; > > int on_rq; > > > > on_rq = p->se.on_rq; > > @@ -8119,6 +8119,8 @@ static void normalize_task(struct rq *rq > > activate_task(rq, p, 0); > > resched_task(rq->curr); > > } > > + > > + check_class_changed(rq, p, prev_class, old_prio); > > } > > > > void normalize_rt_tasks(void) > > Index: linux-2.6/kernel/sched_fair.c > > =================================================================== > > --- linux-2.6.orig/kernel/sched_fair.c > > +++ linux-2.6/kernel/sched_fair.c > > @@ -4054,33 +4054,56 @@ static void task_fork_fair(struct task_s > > * Priority of the task has changed. Check to see if we preempt > > * the current task. > > */ > > -static void prio_changed_fair(struct rq *rq, struct task_struct *p, > > - int oldprio, int running) > > +static void > > +prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio) > > { > > + if (!p->se.on_rq) > > + return; > > + > > /* > > * Reschedule if we are currently running on this runqueue and > > * our priority decreased, or if we are not currently running on > > * this runqueue and our priority is higher than the current's > > */ > > - if (running) { > > + if (rq->curr == p) { > > if (p->prio > oldprio) > > resched_task(rq->curr); > > } else > > check_preempt_curr(rq, p, 0); > > } > > > > +static void switched_from_fair(struct rq *rq, struct task_struct *p) > > +{ > > + struct sched_entity *se = &p->se; > > + struct cfs_rq *cfs_rq = cfs_rq_of(se); > > + > > + /* > > + * Ensure the task's vruntime is normalized, so that when its > > + * switched back to the fair class the enqueue_entity(.flags=0) will > > + * do the right thing. > > + * > > + * If it was on_rq, then the dequeue_entity(.flags=0) will already > > + * have normalized the vruntime, if it was !on_rq, then only when > > + * the task is sleeping will it still have non-normalized vruntime. > > + */ > > + if (!se->on_rq && p->state != TASK_RUNNING) > > + se->vruntime -= cfs_rq->min_vruntime; > > +} > > + > > /* > > * We switched to the sched_fair class. > > */ > > -static void switched_to_fair(struct rq *rq, struct task_struct *p, > > - int running) > > +static void switched_to_fair(struct rq *rq, struct task_struct *p) > > { > > + if (!p->se.on_rq) > > + return; > > + > > /* > > * We were most likely switched from sched_rt, so > > * kick off the schedule if running, otherwise just see > > * if we can still preempt the current task. 
> > */ > > - if (running) > > + if (rq->curr == p) > > resched_task(rq->curr); > > else > > check_preempt_curr(rq, p, 0); > > @@ -4166,6 +4189,7 @@ static const struct sched_class fair_sch > > .task_fork = task_fork_fair, > > > > .prio_changed = prio_changed_fair, > > + .switched_from = switched_from_fair, > > .switched_to = switched_to_fair, > > > > .get_rr_interval = get_rr_interval_fair, > > Index: linux-2.6/include/linux/sched.h > > =================================================================== > > --- linux-2.6.orig/include/linux/sched.h > > +++ linux-2.6/include/linux/sched.h > > @@ -1084,12 +1084,10 @@ struct sched_class { > > void (*task_tick) (struct rq *rq, struct task_struct *p, int queued); > > void (*task_fork) (struct task_struct *p); > > > > - void (*switched_from) (struct rq *this_rq, struct task_struct *task, > > - int running); > > - void (*switched_to) (struct rq *this_rq, struct task_struct *task, > > - int running); > > + void (*switched_from) (struct rq *this_rq, struct task_struct *task); > > + void (*switched_to) (struct rq *this_rq, struct task_struct *task); > > void (*prio_changed) (struct rq *this_rq, struct task_struct *task, > > - int oldprio, int running); > > + int oldprio); > > > > unsigned int (*get_rr_interval) (struct rq *rq, > > struct task_struct *task); > > Index: linux-2.6/kernel/sched_idletask.c > > =================================================================== > > --- linux-2.6.orig/kernel/sched_idletask.c > > +++ linux-2.6/kernel/sched_idletask.c > > @@ -52,8 +52,7 @@ static void set_curr_task_idle(struct rq > > { > > } > > > > -static void switched_to_idle(struct rq *rq, struct task_struct *p, > > - int running) > > +static void switched_to_idle(struct rq *rq, struct task_struct *p) > > { > > /* Can this actually happen?? */ > > if (running) > > @@ -62,8 +61,8 @@ static void switched_to_idle(struct rq * > > check_preempt_curr(rq, p, 0); > > } > > > > -static void prio_changed_idle(struct rq *rq, struct task_struct *p, > > - int oldprio, int running) > > +static void > > +prio_changed_idle(struct rq *rq, struct task_struct *p, int oldprio) > > { > > /* This can happen for hot plug CPUS */ > > > > Index: linux-2.6/kernel/sched_rt.c > > =================================================================== > > --- linux-2.6.orig/kernel/sched_rt.c > > +++ linux-2.6/kernel/sched_rt.c > > @@ -1595,8 +1595,7 @@ static void rq_offline_rt(struct rq *rq) > > * When switch from the rt queue, we bring ourselves to a position > > * that we might want to pull RT tasks from other runqueues. > > */ > > -static void switched_from_rt(struct rq *rq, struct task_struct *p, > > - int running) > > +static void switched_from_rt(struct rq *rq, struct task_struct *p) > > { > > /* > > * If there are other RT tasks then we will reschedule > > @@ -1605,7 +1604,7 @@ static void switched_from_rt(struct rq * > > * we may need to handle the pulling of RT tasks > > * now. > > */ > > - if (!rq->rt.rt_nr_running) > > + if (p->se.on_rq && !rq->rt.rt_nr_running) > > pull_rt_task(rq); > > } > > > > @@ -1624,8 +1623,7 @@ static inline void init_sched_rt_class(v > > * with RT tasks. In this case we try to push them off to > > * other runqueues. 
> > */ > > -static void switched_to_rt(struct rq *rq, struct task_struct *p, > > - int running) > > +static void switched_to_rt(struct rq *rq, struct task_struct *p) > > { > > int check_resched = 1; > > > > @@ -1636,7 +1634,7 @@ static void switched_to_rt(struct rq *rq > > * If that current running task is also an RT task > > * then see if we can move to another run queue. > > */ > > - if (!running) { > > + if (p->se.on_rq && rq->curr != p) { > > #ifdef CONFIG_SMP > > if (rq->rt.overloaded && push_rt_task(rq) && > > /* Don't resched if we changed runqueues */ > > @@ -1652,10 +1650,13 @@ static void switched_to_rt(struct rq *rq > > * Priority of the task has changed. This may cause > > * us to initiate a push or pull. > > */ > > -static void prio_changed_rt(struct rq *rq, struct task_struct *p, > > - int oldprio, int running) > > +static void > > +prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio) > > { > > - if (running) { > > + if (!p->se.on_rq) > > + return; > > + > > + if (rq->curr == p) { > > #ifdef CONFIG_SMP > > /* > > * If our priority decreases while running, we > > Index: linux-2.6/kernel/sched_stoptask.c > > =================================================================== > > --- linux-2.6.orig/kernel/sched_stoptask.c > > +++ linux-2.6/kernel/sched_stoptask.c > > @@ -59,14 +59,13 @@ static void set_curr_task_stop(struct rq > > { > > } > > > > -static void switched_to_stop(struct rq *rq, struct task_struct *p, > > - int running) > > +static void switched_to_stop(struct rq *rq, struct task_struct *p) > > { > > BUG(); /* its impossible to change to this class */ > > } > > > > -static void prio_changed_stop(struct rq *rq, struct task_struct *p, > > - int oldprio, int running) > > +static void > > +prio_changed_stop(struct rq *rq, struct task_struct *p, int oldprio) > > { > > BUG(); /* how!?, what priority? */ > > } > > > ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Bug in scheduler when using rt_mutex 2011-01-19 9:44 ` Peter Zijlstra 2011-01-19 10:38 ` Peter Zijlstra @ 2011-01-20 3:10 ` Yong Zhang 1 sibling, 0 replies; 42+ messages in thread From: Yong Zhang @ 2011-01-20 3:10 UTC (permalink / raw) To: Peter Zijlstra; +Cc: samu.p.onkalo, mingo, linux-kernel@vger.kernel.org, tglx On Wed, Jan 19, 2011 at 5:44 PM, Peter Zijlstra <peterz@infradead.org> wrote: > more or less, the idea is that we only call __{dequeue,enqueue}_entity() > when the task is actually in the tree and current is not. Sure ;) ^ permalink raw reply [flat|nested] 42+ messages in thread
End of thread (newest: 2011-01-21 15:10 UTC). Thread overview: 42+ messages: 2011-01-17 14:42 Bug in scheduler when using rt_mutex Onkalo Samu 2011-01-17 15:00 ` Peter Zijlstra 2011-01-17 15:15 ` samu.p.onkalo 2011-01-17 15:28 ` Peter Zijlstra 2011-01-17 16:00 ` Peter Zijlstra 2011-01-18 8:23 ` Onkalo Samu 2011-01-18 8:59 ` Yong Zhang 2011-01-18 13:35 ` Peter Zijlstra 2011-01-18 14:25 ` Onkalo Samu 2011-01-19 2:38 ` Yong Zhang 2011-01-19 3:43 ` Mike Galbraith 2011-01-19 4:35 ` Yong Zhang 2011-01-19 5:40 ` Mike Galbraith 2011-01-19 6:09 ` Yong Zhang 2011-01-19 6:37 ` Mike Galbraith 2011-01-19 7:19 ` Ingo Molnar 2011-01-19 7:41 ` Mike Galbraith 2011-01-19 9:44 ` Peter Zijlstra 2011-01-19 10:38 ` Peter Zijlstra 2011-01-19 11:30 ` Peter Zijlstra 2011-01-19 12:58 ` Onkalo Samu 2011-01-19 13:13 ` Onkalo Samu 2011-01-19 13:30 ` Peter Zijlstra 2011-01-20 4:18 ` Yong Zhang 2011-01-20 4:27 ` Yong Zhang 2011-01-20 5:32 ` Yong Zhang 2011-01-20 4:59 ` Mike Galbraith 2011-01-20 5:30 ` Yong Zhang 2011-01-20 6:12 ` Mike Galbraith 2011-01-20 7:06 ` Yong Zhang 2011-01-20 8:37 ` Mike Galbraith 2011-01-20 9:07 ` Yong Zhang 2011-01-20 10:07 ` Mike Galbraith 2011-01-21 11:08 ` Peter Zijlstra 2011-01-21 12:24 ` Yong Zhang 2011-01-21 13:40 ` Peter Zijlstra 2011-01-21 15:03 ` Yong Zhang 2011-01-21 15:10 ` Peter Zijlstra 2011-01-21 13:15 ` Yong Zhang 2011-01-20 7:07 ` Onkalo Samu 2011-01-21 6:25 ` Onkalo Samu 2011-01-20 3:10 ` Yong Zhang