* [PATCH] Make sure timers have migrated before killing migration_thread

From: Amit K. Arora @ 2010-05-19  9:05 UTC
To: Peter Zijlstra, Ingo Molnar
Cc: Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

Problem: In a stress test where some heavy tests were running along with
regular CPU offlining and onlining, a hang was observed. The system seems to
be hung at a point where migration_call() tries to kill the migration_thread
of the dying CPU, which just got moved to the current CPU. This migration
thread does not get a chance to run (and die) since rt_throttled is set to 1
on current, and it doesn't get cleared as the hrtimer which is supposed to
reset the rt bandwidth (sched_rt_period_timer) is tied to the CPU being
offlined.

Solution: This patch pushes the killing of the migration thread to the
"CPU_POST_DEAD" event. By then all the timers (including
sched_rt_period_timer) should have been migrated (along with other
callbacks).

Alternate solution considered: Another option considered was to increase
the priority of the hrtimer cpu offline notifier, so that it gets to run
before the scheduler's migration cpu offline notifier. That way we are
sure that the timers will get migrated before migration_call() tries to
kill migration_thread. But, as Srivatsa suggested, this can have some
non-obvious implications.

Testing: Without the patch the stress tests didn't last for even 12 hours,
and yes, the problem was reproducible. With the patch applied the tests
ran successfully for more than 48 hours.

Thanks!
--
Regards,
Amit Arora

Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
--
diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
--- linux-2.6.34.org/kernel/sched.c	2010-05-18 22:56:21.000000000 -0700
+++ linux-2.6.34/kernel/sched.c	2010-05-18 22:58:31.000000000 -0700
@@ -5942,14 +5942,26 @@ migration_call(struct notifier_block *nf
 		cpu_rq(cpu)->migration_thread = NULL;
 		break;

+	case CPU_POST_DEAD:
+		/*
+		Bring the migration thread down in CPU_POST_DEAD event,
+		since the timers should have got migrated by now and thus
+		we should not see a deadlock between trying to kill the
+		migration thread and the sched_rt_period_timer.
+		*/
+		cpuset_lock();
+		rq = cpu_rq(cpu);
+		kthread_stop(rq->migration_thread);
+		put_task_struct(rq->migration_thread);
+		rq->migration_thread = NULL;
+		cpuset_unlock();
+		break;
+
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */
 		migrate_live_tasks(cpu);
 		rq = cpu_rq(cpu);
-		kthread_stop(rq->migration_thread);
-		put_task_struct(rq->migration_thread);
-		rq->migration_thread = NULL;
 		/* Idle task back to normal (off runqueue, low prio) */
 		raw_spin_lock_irq(&rq->lock);
 		update_rq_clock(rq);
* Re: [PATCH] Make sure timers have migrated before killing migration_thread

From: Peter Zijlstra @ 2010-05-19  9:31 UTC
To: Amit K. Arora
Cc: Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

On Wed, 2010-05-19 at 14:35 +0530, Amit K. Arora wrote:
> Problem : In a stress test where some heavy tests were running along with
> regular CPU offlining and onlining, a hang was observed. The system seems to
> be hung at a point where migration_call() tries to kill the migration_thread
> of the dying CPU, which just got moved to the current CPU. This migration
> thread does not get a chance to run (and die) since rt_throttled is set to 1
> on current, and it doesn't get cleared as the hrtimer which is supposed to
> reset the rt bandwidth (sched_rt_period_timer) is tied to the CPU being
> offlined.
>
> Solution : This patch pushes the killing of migration thread to "CPU_POST_DEAD"
> event. By then all the timers (including sched_rt_period_timer) should have got
> migrated (along with other callbacks).
>
> Alternate Solution considered : Another option considered was to
> increase the priority of the hrtimer cpu offline notifier, such that it
> gets to run before scheduler's migration cpu offline notifier. In this
> way we are sure that the timers will get migrated before migration_call
> tries to kill migration_thread. But, this can have some non-obvious
> implications, suggested Srivatsa.
>
> Testing : Without the patch the stress tests didn't last for even 12
> hours. And yes, the problem was reproducible. With the patch applied the
> tests ran successfully for more than 48 hours.
>
> Thanks!
> --
> Regards,
> Amit Arora
>
> Signed-off-by: Amit Arora <aarora@in.ibm.com>
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> --
> diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
> --- linux-2.6.34.org/kernel/sched.c	2010-05-18 22:56:21.000000000 -0700
> +++ linux-2.6.34/kernel/sched.c	2010-05-18 22:58:31.000000000 -0700
> @@ -5942,14 +5942,26 @@ migration_call(struct notifier_block *nf
>  		cpu_rq(cpu)->migration_thread = NULL;
>  		break;
>
> +	case CPU_POST_DEAD:
> +		/*
> +		Bring the migration thread down in CPU_POST_DEAD event,
> +		since the timers should have got migrated by now and thus
> +		we should not see a deadlock between trying to kill the
> +		migration thread and the sched_rt_period_timer.
> +		*/

Faulty comment style that, please use:

/*
 * text
 * goes
 * here
 */

> +		cpuset_lock();
> +		rq = cpu_rq(cpu);
> +		kthread_stop(rq->migration_thread);
> +		put_task_struct(rq->migration_thread);
> +		rq->migration_thread = NULL;
> +		cpuset_unlock();
> +		break;
> +

The other problem is more urgent though, CPU_POST_DEAD runs outside of
the hotplug lock and thus the above becomes a race where we could
possible kill off the migration thread of a newly brought up cpu:

cpu0 - down 2
cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
cpu0 - post_down 2 - frees the migration thread -- oops!
* [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Amit K. Arora @ 2010-05-19 12:13 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-05-19 at 14:35 +0530, Amit K. Arora wrote:

Hi Peter,

Thanks for the review!

> > diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
> > --- linux-2.6.34.org/kernel/sched.c	2010-05-18 22:56:21.000000000 -0700
> > +++ linux-2.6.34/kernel/sched.c	2010-05-18 22:58:31.000000000 -0700
> > @@ -5942,14 +5942,26 @@ migration_call(struct notifier_block *nf
> >  		cpu_rq(cpu)->migration_thread = NULL;
> >  		break;
> >
> > +	case CPU_POST_DEAD:
> > +		/*
> > +		Bring the migration thread down in CPU_POST_DEAD event,
> > +		since the timers should have got migrated by now and thus
> > +		we should not see a deadlock between trying to kill the
> > +		migration thread and the sched_rt_period_timer.
> > +		*/
>
> Faulty comment style that, please use:
>
> /*
>  * text
>  * goes
>  * here
>  */

Sure.

> > +		cpuset_lock();
> > +		rq = cpu_rq(cpu);
> > +		kthread_stop(rq->migration_thread);
> > +		put_task_struct(rq->migration_thread);
> > +		rq->migration_thread = NULL;
> > +		cpuset_unlock();
> > +		break;
> > +
>
> The other problem is more urgent though, CPU_POST_DEAD runs outside of
> the hotplug lock and thus the above becomes a race where we could
> possible kill off the migration thread of a newly brought up cpu:
>
> cpu0 - down 2
> cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> cpu0 - post_down 2 - frees the migration thread -- oops!

Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
The cpuset_lock will synchronize, and thus avoid race between killing of
migration_thread in up_prepare and post_dead events.

Here is the updated patch. If you don't like this one too, do you mind
suggesting an alternate approach to tackle the problem ? Thanks !
--
Regards,
Amit Arora

Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
--
diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
--- linux-2.6.34.org/kernel/sched.c	2010-05-18 22:56:21.000000000 -0700
+++ linux-2.6.34/kernel/sched.c	2010-05-19 04:47:49.000000000 -0700
@@ -5900,6 +5900,19 @@ migration_call(struct notifier_block *nf

 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
+		cpuset_lock();
+		rq = cpu_rq(cpu);
+		/*
+		 * Since we now kill migration_thread in CPU_POST_DEAD event,
+		 * there may be a race here. So, lets cleanup the old
+		 * migration_thread on the rq, if any.
+		 */
+		if (unlikely(rq->migration_thread)) {
+			kthread_stop(rq->migration_thread);
+			put_task_struct(rq->migration_thread);
+			rq->migration_thread = NULL;
+		}
+		cpuset_unlock();
 		p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
 		if (IS_ERR(p))
 			return NOTIFY_BAD;
@@ -5942,14 +5955,34 @@ migration_call(struct notifier_block *nf
 		cpu_rq(cpu)->migration_thread = NULL;
 		break;

+	case CPU_POST_DEAD:
+		/*
+		 * Bring the migration thread down in CPU_POST_DEAD event,
+		 * since the timers should have got migrated by now and thus
+		 * we should not see a deadlock between trying to kill the
+		 * migration thread and the sched_rt_period_timer.
+		 */
+		cpuset_lock();
+		rq = cpu_rq(cpu);
+		if (likely(rq->migration_thread)) {
+			/*
+			 * Its possible that this CPU was onlined (from a
+			 * different CPU) before we reached here and
+			 * migration_thread was cleaned-up in the
+			 * CPU_UP_PREPARE event handling.
+			 */
+			kthread_stop(rq->migration_thread);
+			put_task_struct(rq->migration_thread);
+			rq->migration_thread = NULL;
+		}
+		cpuset_unlock();
+		break;
+
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */
 		migrate_live_tasks(cpu);
 		rq = cpu_rq(cpu);
-		kthread_stop(rq->migration_thread);
-		put_task_struct(rq->migration_thread);
-		rq->migration_thread = NULL;
 		/* Idle task back to normal (off runqueue, low prio) */
 		raw_spin_lock_irq(&rq->lock);
 		update_rq_clock(rq);
* Re: [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Peter Zijlstra @ 2010-05-20  7:28 UTC
To: Amit K. Arora
Cc: Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel, Thomas Gleixner

On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> Alternate Solution considered : Another option considered was to
> increase the priority of the hrtimer cpu offline notifier, such that it
> gets to run before scheduler's migration cpu offline notifier. In this
> way we are sure that the timers will get migrated before migration_call
> tries to kill migration_thread. But, this can have some non-obvious
> implications, suggested Srivatsa.
>
> On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > the hotplug lock and thus the above becomes a race where we could
> > possible kill off the migration thread of a newly brought up cpu:
> >
> > cpu0 - down 2
> > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > cpu0 - post_down 2 - frees the migration thread -- oops!
>
> Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> The cpuset_lock will synchronize, and thus avoid race between killing of
> migration_thread in up_prepare and post_dead events.
>
> Here is the updated patch. If you don't like this one too, do you mind
> suggesting an alternate approach to tackle the problem ? Thanks !

Right, so this isn't pretty at all..

Ingo, the comment near the migration_notifier says that migration_call
should happen before all else, but can you see anything that would break
if we let the timer migration happen first?

Thomas?
> Signed-off-by: Amit Arora <aarora@in.ibm.com>
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> --

For everybody who reads this, _please_ use three (3) dashes '-' to
separate the changelog from the patch, and left-align the changelog
(including all tags).

I seem to get more and more people sending patches with 2 dashes and
daft changelogs with whitespace stuffing which break my scripts.

> diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
> --- linux-2.6.34.org/kernel/sched.c	2010-05-18 22:56:21.000000000 -0700
> +++ linux-2.6.34/kernel/sched.c	2010-05-19 04:47:49.000000000 -0700
> @@ -5900,6 +5900,19 @@ migration_call(struct notifier_block *nf
>
>  	case CPU_UP_PREPARE:
>  	case CPU_UP_PREPARE_FROZEN:
> +		cpuset_lock();
> +		rq = cpu_rq(cpu);
> +		/*
> +		 * Since we now kill migration_thread in CPU_POST_DEAD event,
> +		 * there may be a race here. So, lets cleanup the old
> +		 * migration_thread on the rq, if any.
> +		 */
> +		if (unlikely(rq->migration_thread)) {
> +			kthread_stop(rq->migration_thread);
> +			put_task_struct(rq->migration_thread);
> +			rq->migration_thread = NULL;
> +		}
> +		cpuset_unlock();
>  		p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
>  		if (IS_ERR(p))
>  			return NOTIFY_BAD;
> @@ -5942,14 +5955,34 @@ migration_call(struct notifier_block *nf
>  		cpu_rq(cpu)->migration_thread = NULL;
>  		break;
>
> +	case CPU_POST_DEAD:
> +		/*
> +		 * Bring the migration thread down in CPU_POST_DEAD event,
> +		 * since the timers should have got migrated by now and thus
> +		 * we should not see a deadlock between trying to kill the
> +		 * migration thread and the sched_rt_period_timer.
> +		 */
> +		cpuset_lock();
> +		rq = cpu_rq(cpu);
> +		if (likely(rq->migration_thread)) {
> +			/*
> +			 * Its possible that this CPU was onlined (from a
> +			 * different CPU) before we reached here and
> +			 * migration_thread was cleaned-up in the
> +			 * CPU_UP_PREPARE event handling.
> +			 */
> +			kthread_stop(rq->migration_thread);
> +			put_task_struct(rq->migration_thread);
> +			rq->migration_thread = NULL;
> +		}
> +		cpuset_unlock();
> +		break;
> +
>  	case CPU_DEAD:
>  	case CPU_DEAD_FROZEN:
>  		cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */
>  		migrate_live_tasks(cpu);
>  		rq = cpu_rq(cpu);
> -		kthread_stop(rq->migration_thread);
> -		put_task_struct(rq->migration_thread);
> -		rq->migration_thread = NULL;
>  		/* Idle task back to normal (off runqueue, low prio) */
>  		raw_spin_lock_irq(&rq->lock);
>  		update_rq_clock(rq);
* Re: [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Mike Galbraith @ 2010-05-23  9:07 UTC
To: Peter Zijlstra
Cc: Amit K. Arora, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel, Thomas Gleixner

On Thu, 2010-05-20 at 09:28 +0200, Peter Zijlstra wrote:
> On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> > Alternate Solution considered : Another option considered was to
> > increase the priority of the hrtimer cpu offline notifier, such that it
> > gets to run before scheduler's migration cpu offline notifier. In this
> > way we are sure that the timers will get migrated before migration_call
> > tries to kill migration_thread. But, this can have some non-obvious
> > implications, suggested Srivatsa.
> >
> > On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > > the hotplug lock and thus the above becomes a race where we could
> > > possible kill off the migration thread of a newly brought up cpu:
> > >
> > > cpu0 - down 2
> > > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > > cpu0 - post_down 2 - frees the migration thread -- oops!
> >
> > Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> > The cpuset_lock will synchronize, and thus avoid race between killing of
> > migration_thread in up_prepare and post_dead events.
> >
> > Here is the updated patch. If you don't like this one too, do you mind
> > suggesting an alternate approach to tackle the problem ? Thanks !
>
> Right, so this isn't pretty at all..
Since the problem seems to stem from interfering with a critical thread,
how about create a SCHED_SYSTEM_CRITICAL flag ala SCHED_RESET_ON_FORK?

Not particularly beautiful, and completely untested (well, it compiles).

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6cc43e0..23da2cf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -40,6 +40,8 @@
 #define SCHED_IDLE		5
 /* Can be ORed in to make sure the process is reverted back to SCHED_NORMAL on fork */
 #define SCHED_RESET_ON_FORK     0x40000000
+/* Can be ORed in to flag a thread as being system critical. Not inherited. */
+#define SCHED_SYSTEM_CRITICAL   0x20000000

 #ifdef __KERNEL__

@@ -1240,6 +1242,9 @@ struct task_struct {
 	/* Revert to default priority/policy when forking */
 	unsigned sched_reset_on_fork:1;

+	/* System critical thread. Cleared on fork. */
+	unsigned sched_system_critical:1;
+
 	pid_t pid;
 	pid_t tgid;

diff --git a/kernel/sched.c b/kernel/sched.c
index 2d17e3b..caf8b95 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2511,6 +2511,11 @@ void sched_fork(struct task_struct *p, int clone_flags)
 	 */
 	p->prio = current->normal_prio;

+	/*
+	 * System critical policy flag is not inherited.
+	 */
+	p->sched_system_critical = 0;
+
 	if (!rt_prio(p->prio))
 		p->sched_class = &fair_sched_class;

@@ -4486,7 +4491,7 @@ static int __sched_setscheduler(struct task_struct *p, int policy,
 	unsigned long flags;
 	const struct sched_class *prev_class;
 	struct rq *rq;
-	int reset_on_fork;
+	int reset_on_fork, system_critical;

 	/* may grab non-irq protected spin_locks */
 	BUG_ON(in_interrupt());
@@ -4494,10 +4499,12 @@ recheck:
 	/* double check policy once rq lock held */
 	if (policy < 0) {
 		reset_on_fork = p->sched_reset_on_fork;
+		system_critical = p->sched_system_critical;
 		policy = oldpolicy = p->policy;
 	} else {
 		reset_on_fork = !!(policy & SCHED_RESET_ON_FORK);
-		policy &= ~SCHED_RESET_ON_FORK;
+		system_critical = !!(policy & SCHED_SYSTEM_CRITICAL);
+		policy &= ~(SCHED_RESET_ON_FORK|SCHED_SYSTEM_CRITICAL);

 		if (policy != SCHED_FIFO && policy != SCHED_RR &&
 				policy != SCHED_NORMAL && policy != SCHED_BATCH &&
@@ -4552,6 +4559,10 @@ recheck:
 		/* Normal users shall not reset the sched_reset_on_fork flag */
 		if (p->sched_reset_on_fork && !reset_on_fork)
 			return -EPERM;
+
+		/* Normal users shall not reset the sched_system_critical flag */
+		if (p->sched_system_critical && !system_critical)
+			return -EPERM;
 	}

 	if (user) {
@@ -4560,7 +4571,7 @@ recheck:
 		 * Do not allow realtime tasks into groups that have no runtime
 		 * assigned.
 		 */
-		if (rt_bandwidth_enabled() && rt_policy(policy) &&
+		if (rt_bandwidth_enabled() && rt_policy(policy) && !system_critical &&
 				task_group(p)->rt_bandwidth.rt_runtime == 0)
 			return -EPERM;
 #endif
@@ -4595,6 +4606,7 @@ recheck:
 		p->sched_class->put_prev_task(rq, p);

 	p->sched_reset_on_fork = reset_on_fork;
+	p->sched_system_critical = system_critical;

 	oldprio = p->prio;
 	prev_class = p->sched_class;
@@ -4712,9 +4724,13 @@ SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
 	p = find_process_by_pid(pid);
 	if (p) {
 		retval = security_task_getscheduler(p);
-		if (!retval)
-			retval = p->policy
-				| (p->sched_reset_on_fork ?
				   SCHED_RESET_ON_FORK : 0);
+		if (!retval) {
+			retval = p->policy;
+			if (p->sched_reset_on_fork)
+				retval |= SCHED_RESET_ON_FORK;
+			if (p->sched_system_critical)
+				retval |= SCHED_SYSTEM_CRITICAL;
+		}
 	}
 	rcu_read_unlock();
 	return retval;
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 8afb953..4fb5ced 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -605,6 +605,7 @@ static void update_curr_rt(struct rq *rq)
 	struct sched_rt_entity *rt_se = &curr->rt;
 	struct rt_rq *rt_rq = rt_rq_of_se(rt_se);
 	u64 delta_exec;
+	int system_critical = curr->sched_system_critical;

 	if (!task_has_rt_policy(curr))
 		return;
@@ -621,9 +622,13 @@ static void update_curr_rt(struct rq *rq)
 	curr->se.exec_start = rq->clock;
 	cpuacct_charge(curr, delta_exec);

-	sched_rt_avg_update(rq, delta_exec);
+	/*
+	 * System critical tasks do not contribute to bandwidth consumption,
+	 * nor are they evicted when runtime is exceeded.
+	 */
+	sched_rt_avg_update(rq, system_critical ? 0 : delta_exec);

-	if (!rt_bandwidth_enabled())
+	if (!rt_bandwidth_enabled() || system_critical)
 		return;

 	for_each_sched_rt_entity(rt_se) {
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index b4e7431..6cbea9a 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -304,7 +304,7 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
 					   cpu);
 		if (IS_ERR(p))
 			return NOTIFY_BAD;
-		sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
+		sched_setscheduler_nocheck(p, SCHED_FIFO|SCHED_SYSTEM_CRITICAL, &param);
 		get_task_struct(p);
 		stopper->thread = p;
 		break;
* Re: [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Peter Zijlstra @ 2010-05-23  9:13 UTC
To: Mike Galbraith
Cc: Amit K. Arora, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel, Thomas Gleixner

On Sun, 2010-05-23 at 11:07 +0200, Mike Galbraith wrote:
> On Thu, 2010-05-20 at 09:28 +0200, Peter Zijlstra wrote:
> > On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> > > Alternate Solution considered : Another option considered was to
> > > increase the priority of the hrtimer cpu offline notifier, such that it
> > > gets to run before scheduler's migration cpu offline notifier. In this
> > > way we are sure that the timers will get migrated before migration_call
> > > tries to kill migration_thread. But, this can have some non-obvious
> > > implications, suggested Srivatsa.
> > >
> > > On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > > > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > > > the hotplug lock and thus the above becomes a race where we could
> > > > possible kill off the migration thread of a newly brought up cpu:
> > > >
> > > > cpu0 - down 2
> > > > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > > > cpu0 - post_down 2 - frees the migration thread -- oops!
> > >
> > > Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> > > The cpuset_lock will synchronize, and thus avoid race between killing of
> > > migration_thread in up_prepare and post_dead events.
> > >
> > > Here is the updated patch. If you don't like this one too, do you mind
> > > suggesting an alternate approach to tackle the problem ? Thanks !
> >
> > Right, so this isn't pretty at all..
>
> Since the problem seems to stem from interfering with a critical thread,
> how about create a SCHED_SYSTEM_CRITICAL flag ala SCHED_RESET_ON_FORK?
>
> Not particularly beautiful, and completely untested (well, it compiles).

Nah, I'd rather we pull the migration thread out of SCHED_FIFO and
either schedule it explicit or add a sched_class for it. We need to do
that anyway once we go play with SCHED_DEADLINE.

But it would be very nice if we could simply order the timer and task
migration bits so that the whole problem doesn't exist in the first
place.
* Re: [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Amit K. Arora @ 2010-05-24  6:43 UTC
To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner
Cc: Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

On Thu, May 20, 2010 at 09:28:03AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> > Alternate Solution considered : Another option considered was to
> > increase the priority of the hrtimer cpu offline notifier, such that it
> > gets to run before scheduler's migration cpu offline notifier. In this
> > way we are sure that the timers will get migrated before migration_call
> > tries to kill migration_thread. But, this can have some non-obvious
> > implications, suggested Srivatsa.
> >
> > On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > > the hotplug lock and thus the above becomes a race where we could
> > > possible kill off the migration thread of a newly brought up cpu:
> > >
> > > cpu0 - down 2
> > > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > > cpu0 - post_down 2 - frees the migration thread -- oops!
> >
> > Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> > The cpuset_lock will synchronize, and thus avoid race between killing of
> > migration_thread in up_prepare and post_dead events.
> >
> > Here is the updated patch. If you don't like this one too, do you mind
> > suggesting an alternate approach to tackle the problem ? Thanks !
>
> Right, so this isn't pretty at all..
>
> Ingo, the comment near the migration_notifier says that migration_call
> should happen before all else, but can you see anything that would break
> if we let the timer migration happen first?
>
> Thomas?

Hello Ingo, Thomas,

Do you see any potential problems with migrating the timers before the
migration_call ?

Thanks!
--
Regards,
Amit Arora
* Re: [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Thomas Gleixner @ 2010-05-25 20:19 UTC
To: Peter Zijlstra
Cc: Amit K. Arora, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

On Thu, 20 May 2010, Peter Zijlstra wrote:
> On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> > Alternate Solution considered : Another option considered was to
> > increase the priority of the hrtimer cpu offline notifier, such that it
> > gets to run before scheduler's migration cpu offline notifier. In this
> > way we are sure that the timers will get migrated before migration_call
> > tries to kill migration_thread. But, this can have some non-obvious
> > implications, suggested Srivatsa.
> >
> > On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > > the hotplug lock and thus the above becomes a race where we could
> > > possible kill off the migration thread of a newly brought up cpu:
> > >
> > > cpu0 - down 2
> > > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > > cpu0 - post_down 2 - frees the migration thread -- oops!
> >
> > Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> > The cpuset_lock will synchronize, and thus avoid race between killing of
> > migration_thread in up_prepare and post_dead events.
> >
> > Here is the updated patch. If you don't like this one too, do you mind
> > suggesting an alternate approach to tackle the problem ? Thanks !
>
> Right, so this isn't pretty at all..
>
> Ingo, the comment near the migration_notifier says that migration_call
> should happen before all else, but can you see anything that would break
> if we let the timer migration happen first?
>
> Thomas?

That should work, though what is killing the scheduler per rq hrtimers
_before_ we migrate stuff ? We don't want to migrate them, right ?

Thanks,

	tglx
* Re: [PATCH v2] Make sure timers have migrated before killing migration_thread

From: Peter Zijlstra @ 2010-05-26  6:43 UTC
To: Thomas Gleixner
Cc: Amit K. Arora, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

On Tue, 2010-05-25 at 22:19 +0200, Thomas Gleixner wrote:
> On Thu, 20 May 2010, Peter Zijlstra wrote:
> > On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> > > Alternate Solution considered : Another option considered was to
> > > increase the priority of the hrtimer cpu offline notifier, such that it
> > > gets to run before scheduler's migration cpu offline notifier. In this
> > > way we are sure that the timers will get migrated before migration_call
> > > tries to kill migration_thread. But, this can have some non-obvious
> > > implications, suggested Srivatsa.
> > >
> > > On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > > > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > > > the hotplug lock and thus the above becomes a race where we could
> > > > possible kill off the migration thread of a newly brought up cpu:
> > > >
> > > > cpu0 - down 2
> > > > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > > > cpu0 - post_down 2 - frees the migration thread -- oops!
> > >
> > > Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> > > The cpuset_lock will synchronize, and thus avoid race between killing of
> > > migration_thread in up_prepare and post_dead events.
> > >
> > > Here is the updated patch. If you don't like this one too, do you mind
> > > suggesting an alternate approach to tackle the problem ? Thanks !
> >
> > Right, so this isn't pretty at all..
> >
> > Ingo, the comment near the migration_notifier says that migration_call
> > should happen before all else, but can you see anything that would break
> > if we let the timer migration happen first?
> >
> > Thomas?
>
> That should work, though what is killing the scheduler per rq hrtimers
> _before_ we migrate stuff ? We don't want to migrate them, right ?

They're not rq timers, they're the 'cgroup' bandwidth timers and those
are free to migrate.

What I think happens is that the timer ends up being on the cpu that
goes down, then we disable IRQs on it and run out of bandwidth and get
stuck.

Anyway, we solved it with a one-liner in a different way. Eventually
I'll rip the whole migration thread thingy out of SCHED_FIFO, which too
should solve the issue I think.
* Re: [PATCH] Make sure timers have migrated before killing migration_thread

From: Amit K. Arora @ 2010-05-24  9:59 UTC
To: Peter Zijlstra
Cc: tj, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King, linux-kernel

On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-05-19 at 14:35 +0530, Amit K. Arora wrote:
> > +		cpuset_lock();
> > +		rq = cpu_rq(cpu);
> > +		kthread_stop(rq->migration_thread);
> > +		put_task_struct(rq->migration_thread);
> > +		rq->migration_thread = NULL;
> > +		cpuset_unlock();
> > +		break;
> > +
>
> The other problem is more urgent though, CPU_POST_DEAD runs outside of
> the hotplug lock and thus the above becomes a race where we could
> possible kill off the migration thread of a newly brought up cpu:
>
> cpu0 - down 2
> cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> cpu0 - post_down 2 - frees the migration thread -- oops!

<Adding Tejun Heo to CC list .. >

Hi Peter,

In an offline discussion with Tejun, he suggested that the above race
can not happen, since _cpu_up() and _cpu_down() can never run in
parallel, because of cpu_add_remove_lock. Looking at the code we can see
that cpu_up() and cpu_down() call "_" variants with cpu_add_remove_lock
mutex held (using cpu_maps_update_begin()).

Here is exactly what he had to say:

"I don't think that's possible. There are two locks involved here.
cpu_add_remove_lock and cpu_hotplug.lock. The former wraps around the
second and already provides full exclusion between all cpu
hotplug/unplug operations. The latter is there for reader/writer type
exclusion via get/put_online_cpus().
CPU_POST_DEAD is outside of cpu_hotplug.lock allowing get_online_cpus() to proceed in parallel but it's still inside cpu_add_remove_lock so other cpu up/down operations cannot begin before it finishes. " Thus, since above race can never happen, is there any other issue with this patch ? Thanks! -- Regards, Amit Arora Signed-off-by: Amit Arora <aarora@in.ibm.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> --- diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c --- linux-2.6.34.org/kernel/sched.c 2010-05-18 22:56:21.000000000 -0700 +++ linux-2.6.34/kernel/sched.c 2010-05-18 22:58:31.000000000 -0700 @@ -5942,14 +5942,26 @@ migration_call(struct notifier_block *nf cpu_rq(cpu)->migration_thread = NULL; break; + case CPU_POST_DEAD: + /* + * Bring the migration thread down in CPU_POST_DEAD event, + * since the timers should have got migrated by now and thus + * we should not see a deadlock between trying to kill the + * migration thread and the sched_rt_period_timer. + */ + cpuset_lock(); + rq = cpu_rq(cpu); + kthread_stop(rq->migration_thread); + put_task_struct(rq->migration_thread); + rq->migration_thread = NULL; + cpuset_unlock(); + break; + case CPU_DEAD: case CPU_DEAD_FROZEN: cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */ migrate_live_tasks(cpu); rq = cpu_rq(cpu); - kthread_stop(rq->migration_thread); - put_task_struct(rq->migration_thread); - rq->migration_thread = NULL; /* Idle task back to normal (off runqueue, low prio) */ raw_spin_lock_irq(&rq->lock); update_rq_clock(rq); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Make sure timers have migrated before killing migration_thread
  2010-05-24  9:59 ` [PATCH] " Amit K. Arora
@ 2010-05-24 13:28   ` Peter Zijlstra
  2010-05-24 15:16     ` Srivatsa Vaddagiri
  2010-05-25 11:31   ` Peter Zijlstra
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2010-05-24 13:28 UTC (permalink / raw)
To: Amit K. Arora
Cc: tj, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart,
	Brian King, linux-kernel

On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> since _cpu_up() and _cpu_down() can never run in
> parallel, because of cpu_add_remove_lock.

Ah indeed. I guess your initial patch works then.
* Re: [PATCH] Make sure timers have migrated before killing migration_thread
  2010-05-24 13:28 ` Peter Zijlstra
@ 2010-05-24 15:16   ` Srivatsa Vaddagiri
  2010-05-24 15:55     ` Peter Zijlstra
  0 siblings, 1 reply; 19+ messages in thread
From: Srivatsa Vaddagiri @ 2010-05-24 15:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Amit K. Arora, tj, Ingo Molnar, Gautham R Shenoy, Darren Hart,
	Brian King, linux-kernel

On Mon, May 24, 2010 at 03:28:45PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> > since _cpu_up() and _cpu_down() can never run in
> > parallel, because of cpu_add_remove_lock.
>
> Ah indeed. I guess your initial patch works then.

One thing I found surprising was that a cpu's rt-bandwidth renewal could
be dependent on another cpu's (rt-bandwidth) timer firing on time. In
this case, we had migration/23 pulled over to CPU0 and we hung later
waiting for migration/23 to exit. migration/23 was not exiting because
it could not run on CPU0 (as CPU0's rt-bandwidth had expired). This
situation remained forever. I would have expected CPU0's bandwidth to
have been renewed independent of some timer on CPU23 firing - maybe I am
missing something not obvious in the code?

- vatsa
* Re: [PATCH] Make sure timers have migrated before killing migration_thread
  2010-05-24 15:16 ` Srivatsa Vaddagiri
@ 2010-05-24 15:55   ` Peter Zijlstra
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2010-05-24 15:55 UTC (permalink / raw)
To: vatsa
Cc: Amit K. Arora, tj, Ingo Molnar, Gautham R Shenoy, Darren Hart,
	Brian King, linux-kernel

On Mon, 2010-05-24 at 20:46 +0530, Srivatsa Vaddagiri wrote:
> On Mon, May 24, 2010 at 03:28:45PM +0200, Peter Zijlstra wrote:
> > On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> > > since _cpu_up() and _cpu_down() can never run in
> > > parallel, because of cpu_add_remove_lock.
> >
> > Ah indeed. I guess your initial patch works then.
>
> One thing I found surprising was that a cpu's rt-bandwidth renewal could be
> dependant on another cpu's (rt-bandwidth) timer firing ontime. In this case, we
> had migration/23 pulled over to CPU0 and we hung later waiting for migration/23
> to exit. migration/23 was not exiting because it could not run on CPU0 (as
> CPU0's rt-bandwidth had expired). This situation remained forever. I would have
> expected CPU0's bandwidth to have been renewed independent of some timer on
> CPU23 to fire - maybe I am missing something not obvious in the code?

The bandwidth constraint is per cgroup, and cgroups span cpus.
* Re: [PATCH] Make sure timers have migrated before killing migration_thread
  2010-05-24  9:59 ` [PATCH] " Amit K. Arora
  2010-05-24 13:28   ` Peter Zijlstra
@ 2010-05-25 11:31   ` Peter Zijlstra
  2010-05-25 12:10     ` Amit K. Arora
  2010-05-25 13:23     ` [PATCH v3] " Amit K. Arora
  1 sibling, 2 replies; 19+ messages in thread
From: Peter Zijlstra @ 2010-05-25 11:31 UTC (permalink / raw)
To: Amit K. Arora
Cc: tj, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart,
	Brian King, linux-kernel

On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
>
> Thus, since above race can never happen, is there any other issue with
> this patch ?

It doesn't seem to apply nicely...

> Signed-off-by: Amit Arora <aarora@in.ibm.com>
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> ---
> diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
> --- linux-2.6.34.org/kernel/sched.c	2010-05-18 22:56:21.000000000 -0700
> +++ linux-2.6.34/kernel/sched.c	2010-05-18 22:58:31.000000000 -0700
> @@ -5942,14 +5942,26 @@ migration_call(struct notifier_block *nf
>  		cpu_rq(cpu)->migration_thread = NULL;
>  		break;
> 
> +	case CPU_POST_DEAD:
> +		/*
> +		 * Bring the migration thread down in CPU_POST_DEAD event,
> +		 * since the timers should have got migrated by now and thus
> +		 * we should not see a deadlock between trying to kill the
> +		 * migration thread and the sched_rt_period_timer.
> +		 */
> +		cpuset_lock();
> +		rq = cpu_rq(cpu);
> +		kthread_stop(rq->migration_thread);
> +		put_task_struct(rq->migration_thread);
> +		rq->migration_thread = NULL;
> +		cpuset_unlock();
> +		break;
> +
>  	case CPU_DEAD:
>  	case CPU_DEAD_FROZEN:
>  		cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */
>  		migrate_live_tasks(cpu);
>  		rq = cpu_rq(cpu);
> -		kthread_stop(rq->migration_thread);
> -		put_task_struct(rq->migration_thread);
> -		rq->migration_thread = NULL;
>  		/* Idle task back to normal (off runqueue, low prio) */
>  		raw_spin_lock_irq(&rq->lock);
>  		update_rq_clock(rq);
* Re: [PATCH] Make sure timers have migrated before killing migration_thread
  2010-05-25 11:31 ` Peter Zijlstra
@ 2010-05-25 12:10   ` Amit K. Arora
  2010-05-25 13:23 ` [PATCH v3] " Amit K. Arora
  1 sibling, 0 replies; 19+ messages in thread
From: Amit K. Arora @ 2010-05-25 12:10 UTC (permalink / raw)
To: Peter Zijlstra
Cc: tj, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart,
	Brian King, linux-kernel

On Tue, May 25, 2010 at 01:31:35PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> >
> > Thus, since above race can never happen, is there any other issue with
> > this patch ?
>
> It doesn't seem to apply nicely...

Hi Peter,

This patch was built on top of 2.6.34, where it applies cleanly. I
checked the git tree and the cpu_stop patchset from Tejun has changed
this code quite a bit (commit 969c79215a35b06e5e3efe69b9412f858df7856c).
The change is now required in cpu_stop_cpu_callback() in stop_machine.c.
Will post it soon.

Thanks!
--
Regards,
Amit Arora
* [PATCH v3] Make sure timers have migrated before killing migration_thread
  2010-05-25 11:31 ` Peter Zijlstra
  2010-05-25 12:10   ` Amit K. Arora
@ 2010-05-25 13:23   ` Amit K. Arora
  2010-05-25 14:22     ` Peter Zijlstra
                       ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Amit K. Arora @ 2010-05-25 13:23 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: tj, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart, Brian King,
	linux-kernel

On Tue, May 25, 2010 at 01:31:35PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> >
> > Thus, since above race can never happen, is there any other issue with
> > this patch ?
>
> It doesn't seem to apply nicely...

Here is the new patch.

Problem : In a stress test where some heavy tests were running along
with regular CPU offlining and onlining, a hang was observed. The system
seems to be hung at a point where migration_call() tries to kill the
migration_thread of the dying CPU, which just got moved to the current
CPU. This migration thread does not get a chance to run (and die) since
rt_throttled is set to 1 on current, and it doesn't get cleared as the
hrtimer which is supposed to reset the rt bandwidth
(sched_rt_period_timer) is tied to the CPU which we just marked dead!

Solution : This patch pushes the killing of migration thread to
"CPU_POST_DEAD" event. By then all the timers (including
sched_rt_period_timer) should have got migrated (along with other
callbacks).

Thanks!
--
Regards,
Amit Arora

Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
---
diff -Nuarp linux-2.6-next-20100525.org/kernel/stop_machine.c linux-2.6-next-20100525/kernel/stop_machine.c
--- linux-2.6-next-20100525.org/kernel/stop_machine.c	2010-05-25 14:27:51.000000000 -0400
+++ linux-2.6-next-20100525/kernel/stop_machine.c	2010-05-25 14:30:09.000000000 -0400
@@ -321,7 +321,7 @@ static int __cpuinit cpu_stop_cpu_callba
 
 #ifdef CONFIG_HOTPLUG_CPU
 	case CPU_UP_CANCELED:
-	case CPU_DEAD:
+	case CPU_POST_DEAD:
 	{
 		struct cpu_stop_work *work;
* Re: [PATCH v3] Make sure timers have migrated before killing migration_thread
  2010-05-25 13:23 ` [PATCH v3] " Amit K. Arora
@ 2010-05-25 14:22   ` Peter Zijlstra
  2010-05-25 16:27 ` Tejun Heo
  2010-05-31  7:18 ` [tip:sched/urgent] sched: Make sure timers have migrated before killing the migration_thread tip-bot for Amit K. Arora
  2 siblings, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2010-05-25 14:22 UTC (permalink / raw)
To: Amit K. Arora
Cc: Ingo Molnar, tj, Srivatsa Vaddagiri, Gautham R Shenoy, Darren Hart,
	Brian King, linux-kernel

On Tue, 2010-05-25 at 18:53 +0530, Amit K. Arora wrote:

nice :-) Thanks!

> Signed-off-by: Amit Arora <aarora@in.ibm.com>
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Tejun Heo <tj@kernel.org>
> ---
> diff -Nuarp linux-2.6-next-20100525.org/kernel/stop_machine.c linux-2.6-next-20100525/kernel/stop_machine.c
> --- linux-2.6-next-20100525.org/kernel/stop_machine.c	2010-05-25 14:27:51.000000000 -0400
> +++ linux-2.6-next-20100525/kernel/stop_machine.c	2010-05-25 14:30:09.000000000 -0400
> @@ -321,7 +321,7 @@ static int __cpuinit cpu_stop_cpu_callba
> 
>  #ifdef CONFIG_HOTPLUG_CPU
>  	case CPU_UP_CANCELED:
> -	case CPU_DEAD:
> +	case CPU_POST_DEAD:
>  	{
>  		struct cpu_stop_work *work;
* Re: [PATCH v3] Make sure timers have migrated before killing migration_thread
  2010-05-25 13:23 ` [PATCH v3] " Amit K. Arora
  2010-05-25 14:22   ` Peter Zijlstra
@ 2010-05-25 16:27 ` Tejun Heo
  2010-05-31  7:18 ` [tip:sched/urgent] sched: Make sure timers have migrated before killing the migration_thread tip-bot for Amit K. Arora
  2 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2010-05-25 16:27 UTC (permalink / raw)
To: Amit K. Arora
Cc: Peter Zijlstra, Ingo Molnar, Srivatsa Vaddagiri, Gautham R Shenoy,
	Darren Hart, Brian King, linux-kernel

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

--
tejun
* [tip:sched/urgent] sched: Make sure timers have migrated before killing the migration_thread
  2010-05-25 13:23 ` [PATCH v3] " Amit K. Arora
  2010-05-25 14:22   ` Peter Zijlstra
  2010-05-25 16:27   ` Tejun Heo
@ 2010-05-31  7:18   ` tip-bot for Amit K. Arora
  2 siblings, 0 replies; 19+ messages in thread
From: tip-bot for Amit K. Arora @ 2010-05-31  7:18 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, ego, aarora, hpa, mingo, a.p.zijlstra, tj, tglx,
	aarora, mingo

Commit-ID:  54e88fad223c4e1d94289611a90c7fe3ebe5631b
Gitweb:     http://git.kernel.org/tip/54e88fad223c4e1d94289611a90c7fe3ebe5631b
Author:     Amit K. Arora <aarora@linux.vnet.ibm.com>
AuthorDate: Tue, 25 May 2010 18:53:46 +0530
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 31 May 2010 08:37:44 +0200

sched: Make sure timers have migrated before killing the migration_thread

Problem: In a stress test where some heavy tests were running along
with regular CPU offlining and onlining, a hang was observed. The
system seems to be hung at a point where migration_call() tries to kill
the migration_thread of the dying CPU, which just got moved to the
current CPU. This migration thread does not get a chance to run (and
die) since rt_throttled is set to 1 on current, and it doesn't get
cleared as the hrtimer which is supposed to reset the rt bandwidth
(sched_rt_period_timer) is tied to the CPU which we just marked dead!

Solution: This patch pushes the killing of migration thread to
"CPU_POST_DEAD" event. By then all the timers (including
sched_rt_period_timer) should have got migrated (along with other
callbacks).

Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20100525132346.GA14986@amitarora.in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/stop_machine.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index b4e7431..70f8d90 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -321,7 +321,7 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
 
 #ifdef CONFIG_HOTPLUG_CPU
 	case CPU_UP_CANCELED:
-	case CPU_DEAD:
+	case CPU_POST_DEAD:
 	{
 		struct cpu_stop_work *work;
end of thread, other threads:[~2010-05-31  7:20 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-19  9:05 [PATCH] Make sure timers have migrated before killing migration_thread Amit K. Arora
2010-05-19  9:31 ` Peter Zijlstra
2010-05-19 12:13   ` [PATCH v2] " Amit K. Arora
2010-05-20  7:28     ` Peter Zijlstra
2010-05-23  9:07       ` Mike Galbraith
2010-05-23  9:13         ` Peter Zijlstra
2010-05-24  6:43       ` Amit K. Arora
2010-05-25 20:19         ` Thomas Gleixner
2010-05-26  6:43           ` Peter Zijlstra
2010-05-24  9:59   ` [PATCH] " Amit K. Arora
2010-05-24 13:28     ` Peter Zijlstra
2010-05-24 15:16       ` Srivatsa Vaddagiri
2010-05-24 15:55         ` Peter Zijlstra
2010-05-25 11:31     ` Peter Zijlstra
2010-05-25 12:10       ` Amit K. Arora
2010-05-25 13:23       ` [PATCH v3] " Amit K. Arora
2010-05-25 14:22         ` Peter Zijlstra
2010-05-25 16:27         ` Tejun Heo
2010-05-31  7:18         ` [tip:sched/urgent] sched: Make sure timers have migrated before killing the migration_thread tip-bot for Amit K. Arora