* RT scheduling: wakeup bug?
@ 2007-10-01 22:15 Mike Kravetz
2007-10-02 5:06 ` -rt " Ingo Molnar
0 siblings, 1 reply; 6+ messages in thread
From: Mike Kravetz @ 2007-10-01 22:15 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Linux Kernel Mailing List
I've been trying to track down some unexpected realtime latencies and
believe one source is a bug in the wakeup code. Specifically, this is
within the try_to_wake_up() routine. Within this routine there is the
following code segment:
/*
* If a newly woken up RT task cannot preempt the
* current (RT) task (on a target runqueue) then try
* to find another CPU it can preempt:
*/
if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
struct rq *this_rq = cpu_rq(this_cpu);
/*
* Special-case: the task on this CPU can be
* preempted. In that case there's no need to
* trigger reschedules on other CPUs, we can
* mark the current task for reschedule.
*
* (Note that it's safe to access this_rq without
* extra locking in this particular case, because
* we are on the current CPU.)
*/
if (TASK_PREEMPTS_CURR(p, this_rq))
set_tsk_need_resched(this_rq->curr);
else
/*
* Neither the intended target runqueue
* nor the current CPU can take this task.
* Trigger a reschedule on all other CPUs
* nevertheless, maybe one of them can take
* this task:
*/
smp_send_reschedule_allbutself_cpumask(p->cpus_allowed);
schedstat_inc(this_rq, rto_wakeup);
}
This logic seems appropriate. But, the task 'p' is most likely not on
the runqueue when sending the IPI. It gets added to the runqueue a
little later in the routine. As a result, the 'rt_overload' global may
not be set (based on the count of RT tasks on the runqueue) and other
CPUs may 'pass over' the runqueue when doing RT load balancing.
My observations/debugging/conclusions are based on an earlier version
of the code. It appears the same code/issue still exists in the most
version. But, I have not not done any work with the latest version.
--
Mike
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: -rt scheduling: wakeup bug? 2007-10-01 22:15 RT scheduling: wakeup bug? Mike Kravetz @ 2007-10-02 5:06 ` Ingo Molnar 2007-10-02 19:30 ` Mike Kravetz 2007-10-03 17:37 ` Mike Kravetz 0 siblings, 2 replies; 6+ messages in thread From: Ingo Molnar @ 2007-10-02 5:06 UTC (permalink / raw) To: Mike Kravetz Cc: Linux Kernel Mailing List, Thomas Gleixner, Steven Rostedt, Clark Williams hi Mike, * Mike Kravetz <kravetz@us.ibm.com> wrote: > I've been trying to track down some unexpected realtime latencies and > believe one source is a bug in the wakeup code. Specifically, this is > within the try_to_wake_up() routine. Within this routine there is the > following code segment: > > /* > * If a newly woken up RT task cannot preempt the > * current (RT) task (on a target runqueue) then try > * to find another CPU it can preempt: > */ > if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) { > struct rq *this_rq = cpu_rq(this_cpu); > /* > * Special-case: the task on this CPU can be > * preempted. In that case there's no need to > * trigger reschedules on other CPUs, we can > * mark the current task for reschedule. > * > * (Note that it's safe to access this_rq without > * extra locking in this particular case, because > * we are on the current CPU.) > */ > if (TASK_PREEMPTS_CURR(p, this_rq)) > set_tsk_need_resched(this_rq->curr); > else > /* > * Neither the intended target runqueue > * nor the current CPU can take this task. > * Trigger a reschedule on all other CPUs > * nevertheless, maybe one of them can take > * this task: > */ > smp_send_reschedule_allbutself_cpumask(p->cpus_allowed); > > schedstat_inc(this_rq, rto_wakeup); > } > > This logic seems appropriate. But, the task 'p' is most likely not on > the runqueue when sending the IPI. It gets added to the runqueue a > little later in the routine. As a result, the 'rt_overload' global > may not be set (based on the count of RT tasks on the runqueue) and > other CPUs may 'pass over' the runqueue when doing RT load balancing. > > My observations/debugging/conclusions are based on an earlier version > of the code. It appears the same code/issue still exists in the most > version. But, I have not not done any work with the latest version. I believe you are right - nice catch of this very nontrivial bug! The patch below is against .23-rc - do you think this fix (of moving the rt wakeup sequence to after the activate_task()) is adequate? Ingo Index: linux-rt-rebase.q/kernel/sched.c =================================================================== --- linux-rt-rebase.q.orig/kernel/sched.c +++ linux-rt-rebase.q/kernel/sched.c @@ -1819,6 +1819,13 @@ out_set_cpu: cpu = task_cpu(p); } +out_activate: +#endif /* CONFIG_SMP */ + + activate_task(rq, p, 1); + + trace_start_sched_wakeup(p, rq); + /* * If a newly woken up RT task cannot preempt the * current (RT) task (on a target runqueue) then try @@ -1849,28 +1856,21 @@ out_set_cpu: smp_send_reschedule_allbutself_cpumask(p->cpus_allowed); schedstat_inc(this_rq, rto_wakeup); - } - -out_activate: -#endif /* CONFIG_SMP */ - - activate_task(rq, p, 1); - - trace_start_sched_wakeup(p, rq); - - /* - * Sync wakeups (i.e. those types of wakeups where the waker - * has indicated that it will leave the CPU in short order) - * don't trigger a preemption, if the woken up task will run on - * this cpu. (in this case the 'I will reschedule' promise of - * the waker guarantees that the freshly woken up task is going - * to be considered on this CPU.) - */ - if (!sync || cpu != this_cpu) - check_preempt_curr(rq, p); - else { - if (TASK_PREEMPTS_CURR(p, rq)) - set_tsk_need_resched_delayed(rq->curr); + } else { + /* + * Sync wakeups (i.e. those types of wakeups where the waker + * has indicated that it will leave the CPU in short order) + * don't trigger a preemption, if the woken up task will run on + * this cpu. (in this case the 'I will reschedule' promise of + * the waker guarantees that the freshly woken up task is going + * to be considered on this CPU.) + */ + if (!sync || cpu != this_cpu) + check_preempt_curr(rq, p); + else { + if (TASK_PREEMPTS_CURR(p, rq)) + set_tsk_need_resched_delayed(rq->curr); + } } if (rq->curr && p && rq && _need_resched()) trace_special_pid(p->pid, PRIO(p), PRIO(rq->curr)); ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: -rt scheduling: wakeup bug? 2007-10-02 5:06 ` -rt " Ingo Molnar @ 2007-10-02 19:30 ` Mike Kravetz 2007-10-02 19:37 ` Steven Rostedt 2007-10-03 17:37 ` Mike Kravetz 1 sibling, 1 reply; 6+ messages in thread From: Mike Kravetz @ 2007-10-02 19:30 UTC (permalink / raw) To: Ingo Molnar Cc: Linux Kernel Mailing List, Thomas Gleixner, Steven Rostedt, Clark Williams On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote: > * Mike Kravetz <kravetz@us.ibm.com> wrote: > > > > My observations/debugging/conclusions are based on an earlier version > > of the code. It appears the same code/issue still exists in the most > > version. But, I have not not done any work with the latest version. > > I believe you are right - nice catch of this very nontrivial bug! The > patch below is against .23-rc - do you think this fix (of moving the rt > wakeup sequence to after the activate_task()) is adequate? Yes, I have been running with a similar patch on a (much) earlier version of the code. It has helped quite a bit. I would have put together a patch for a later version, but my test environment is limited to this earlier version. -- Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: -rt scheduling: wakeup bug? 2007-10-02 19:30 ` Mike Kravetz @ 2007-10-02 19:37 ` Steven Rostedt 0 siblings, 0 replies; 6+ messages in thread From: Steven Rostedt @ 2007-10-02 19:37 UTC (permalink / raw) To: Mike Kravetz Cc: Ingo Molnar, Linux Kernel Mailing List, Thomas Gleixner, Clark Williams -- On Tue, 2 Oct 2007, Mike Kravetz wrote: > On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote: > > * Mike Kravetz <kravetz@us.ibm.com> wrote: > > > > > > My observations/debugging/conclusions are based on an earlier version > > > of the code. It appears the same code/issue still exists in the most > > > version. But, I have not not done any work with the latest version. > > > > I believe you are right - nice catch of this very nontrivial bug! The > > patch below is against .23-rc - do you think this fix (of moving the rt > > wakeup sequence to after the activate_task()) is adequate? > > Yes, I have been running with a similar patch on a (much) earlier > version of the code. It has helped quite a bit. I would have > put together a patch for a later version, but my test environment > is limited to this earlier version. FYI, I've incorporated Ingo's patch into the lastest -rt release. http://www.kernel.org/pub/linux/kernel/projects/rt/patch-2.6.23-rc9-rt1.bz2 -- Steve ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: -rt scheduling: wakeup bug? 2007-10-02 5:06 ` -rt " Ingo Molnar 2007-10-02 19:30 ` Mike Kravetz @ 2007-10-03 17:37 ` Mike Kravetz 2007-10-04 8:02 ` Ingo Molnar 1 sibling, 1 reply; 6+ messages in thread From: Mike Kravetz @ 2007-10-03 17:37 UTC (permalink / raw) To: Ingo Molnar Cc: Linux Kernel Mailing List, Thomas Gleixner, Steven Rostedt, Clark Williams On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote: > Index: linux-rt-rebase.q/kernel/sched.c > =================================================================== > --- linux-rt-rebase.q.orig/kernel/sched.c > +++ linux-rt-rebase.q/kernel/sched.c > @@ -1819,6 +1819,13 @@ out_set_cpu: > cpu = task_cpu(p); > } > > +out_activate: > +#endif /* CONFIG_SMP */ > + > + activate_task(rq, p, 1); > + > + trace_start_sched_wakeup(p, rq); > + > /* > * If a newly woken up RT task cannot preempt the > * current (RT) task (on a target runqueue) then try > @@ -1849,28 +1856,21 @@ out_set_cpu: > smp_send_reschedule_allbutself_cpumask(p->cpus_allowed); > > schedstat_inc(this_rq, rto_wakeup); > - } > - > -out_activate: > -#endif /* CONFIG_SMP */ > - > - activate_task(rq, p, 1); > - > - trace_start_sched_wakeup(p, rq); > - > - /* > - * Sync wakeups (i.e. those types of wakeups where the waker > - * has indicated that it will leave the CPU in short order) > - * don't trigger a preemption, if the woken up task will run on > - * this cpu. (in this case the 'I will reschedule' promise of > - * the waker guarantees that the freshly woken up task is going > - * to be considered on this CPU.) > - */ > - if (!sync || cpu != this_cpu) > - check_preempt_curr(rq, p); > - else { > - if (TASK_PREEMPTS_CURR(p, rq)) > - set_tsk_need_resched_delayed(rq->curr); > + } else { > + /* > + * Sync wakeups (i.e. those types of wakeups where the waker > + * has indicated that it will leave the CPU in short order) > + * don't trigger a preemption, if the woken up task will run on > + * this cpu. (in this case the 'I will reschedule' promise of > + * the waker guarantees that the freshly woken up task is going > + * to be considered on this CPU.) > + */ > + if (!sync || cpu != this_cpu) > + check_preempt_curr(rq, p); > + else { > + if (TASK_PREEMPTS_CURR(p, rq)) > + set_tsk_need_resched_delayed(rq->curr); > + } > } > if (rq->curr && p && rq && _need_resched()) > trace_special_pid(p->pid, PRIO(p), PRIO(rq->curr)); Not an issue with the patch, just that last bit of code pulled in for context. I don't think it is a bug, but the checking of 'rq' after checking 'rq->curr' just doesn't look right (or necessary). Could it just be an artifact from earlier versions of the code? -- Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: -rt scheduling: wakeup bug? 2007-10-03 17:37 ` Mike Kravetz @ 2007-10-04 8:02 ` Ingo Molnar 0 siblings, 0 replies; 6+ messages in thread From: Ingo Molnar @ 2007-10-04 8:02 UTC (permalink / raw) To: Mike Kravetz Cc: Linux Kernel Mailing List, Thomas Gleixner, Steven Rostedt, Clark Williams * Mike Kravetz <kravetz@us.ibm.com> wrote: > > if (rq->curr && p && rq && _need_resched()) > > trace_special_pid(p->pid, PRIO(p), PRIO(rq->curr)); > > Not an issue with the patch, just that last bit of code pulled in for > context. I don't think it is a bug, but the checking of 'rq' after > checking 'rq->curr' just doesn't look right (or necessary). Could it > just be an artifact from earlier versions of the code? yeah, you are right - and rq shouldnt ever be NULL there anyway. Ingo ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-10-04 8:03 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-10-01 22:15 RT scheduling: wakeup bug? Mike Kravetz 2007-10-02 5:06 ` -rt " Ingo Molnar 2007-10-02 19:30 ` Mike Kravetz 2007-10-02 19:37 ` Steven Rostedt 2007-10-03 17:37 ` Mike Kravetz 2007-10-04 8:02 ` Ingo Molnar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox