* CPU offline/online related hang with latest git
@ 2009-12-20 10:04 Sachin Sant
  2009-12-20 13:31 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Sachin Sant @ 2009-12-20 10:04 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
CPU hotplug on multiple architectures. Trying to offline
a CPU results in a machine hang.

Sample o/p from a 4 way x86_64 box :

#:/sys/devices/system/cpu/cpu3 # cat online
1
#:/sys/devices/system/cpu/cpu3 # echo 0 > online

Stack:
Call Trace:
 <IRQ>
 <EOI>
Code: 45 f0 48 89 45 b8 48 8d 45 d0 4c 89 4d f8 c7 45 b0 10 00 00 00 48 89 45 c0 e8 5a ff ff ff c9 c3 89 f0 b9 40 00 00 00 55 99 f7 f9 <48> 89 e5 48 89 f9 48 83 ec 08 31 d2 41 89 c0 eb 12 48 8b 01 48
Stack:
Call Trace:
Code: 20 00 00 00 00 48 89 3e 49 8d 74 24 28 e8 45 75 1d 00 49 c7 44 24 50 00 00 00 00 48 8b 9b 60 01 00 00 48 85 db 0f 85 25 ff ff ff <5b> 41 5c c9 c3 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 81 ec
Stack:
Call Trace:
Code: 48 0f 45 d8 eb 1a 44 0f a3 20 19 c0 48 c7 c3 10 b5 a0 81 85 c0 48 c7 c0 30 b5 a0 81 48 0f 44 d8 31 c0 f3 90 44 8b 25 82 35 96 00 <41> 39 c4 74 46 41 83 fc 02 74 08 41 83 fc 03 75 12 eb 03 fa eb
Stack:
Call Trace:
Code: 48 c7 c0 30 b5 a0 81 48 0f 45 d8 eb 1a 44 0f a3 20 19 c0 48 c7 c3 10 b5 a0 81 85 c0 48 c7 c0 30 b5 a0 81 48 0f 44 d8 31 c0 f3 90 <44> 8b 25 82 35 96 00 41 39 c4 74 46 41 83 fc 02 74 08 41 83 fc
Uhhuh. NMI received for unknown reason 25 on CPU 0.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

The above messages are repeated after this.

I observed similar hangs on other architectures as well
(s390x, PowerPC, x86_32).

2.6.33-rc1 worked fine. I haven't tried the bisect. Will do that
first thing tomorrow morning.

Thanks
-Sachin

--

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
* Re: CPU offline/online related hang with latest git
  2009-12-20 10:04 CPU offline/online related hang with latest git Sachin Sant
@ 2009-12-20 13:31 ` Peter Zijlstra
  2009-12-20 14:51   ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 13:31 UTC
  To: Sachin Sant
  Cc: linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> CPU hotplug on multiple architectures. Trying to offline
> a CPU results in a machine hang.

damnit, you're right.. it's getting stuck during hotplug on stop machine
some place.

didn't notice it because it didn't crash..

sorry for that :-(
* Re: CPU offline/online related hang with latest git
  2009-12-20 13:31 ` Peter Zijlstra
@ 2009-12-20 14:51   ` Jens Axboe
  2009-12-20 14:54     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2009-12-20 14:51 UTC
  To: Peter Zijlstra
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, Dec 20 2009, Peter Zijlstra wrote:
> On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> > With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> > CPU hotplug on multiple architectures. Trying to offline
> > a CPU results in a machine hang.
>
> damnit, you're right.. it's getting stuck during hotplug on stop machine
> some place.
>
> didn't notice it because it didn't crash..
>
> sorry for that :-(

Perhaps this is also why suspend doesn't work on current -git... -rc1 is
fine.

--
Jens Axboe
* Re: CPU offline/online related hang with latest git
  2009-12-20 14:51   ` Jens Axboe
@ 2009-12-20 14:54     ` Peter Zijlstra
  2009-12-20 14:57       ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 14:54 UTC
  To: Jens Axboe
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, 2009-12-20 at 15:51 +0100, Jens Axboe wrote:
> On Sun, Dec 20 2009, Peter Zijlstra wrote:
> > On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> > > With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> > > CPU hotplug on multiple architectures. Trying to offline
> > > a CPU results in a machine hang.
> >
> > damnit, you're right.. it's getting stuck during hotplug on stop machine
> > some place.
> >
> > didn't notice it because it didn't crash..
> >
> > sorry for that :-(
>
> Perhaps this is also why suspend doesn't work on current -git... -rc1 is
> fine.

Yep, suspend relies on hotplug.
* Re: CPU offline/online related hang with latest git
  2009-12-20 14:54     ` Peter Zijlstra
@ 2009-12-20 14:57       ` Jens Axboe
  2009-12-20 15:00         ` Peter Zijlstra
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
  0 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2009-12-20 14:57 UTC
  To: Peter Zijlstra
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, Dec 20 2009, Peter Zijlstra wrote:
> On Sun, 2009-12-20 at 15:51 +0100, Jens Axboe wrote:
> > On Sun, Dec 20 2009, Peter Zijlstra wrote:
> > > On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> > > > With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> > > > CPU hotplug on multiple architectures. Trying to offline
> > > > a CPU results in a machine hang.
> > >
> > > damnit, you're right.. it's getting stuck during hotplug on stop machine
> > > some place.
> > >
> > > didn't notice it because it didn't crash..
> > >
> > > sorry for that :-(
> >
> > Perhaps this is also why suspend doesn't work on current -git... -rc1 is
> > fine.
>
> Yep, suspend relies on hotplug.

Exactly, hence the connection. Shall I bisect, or do you already know
what the problem is?

--
Jens Axboe
* Re: CPU offline/online related hang with latest git
  2009-12-20 14:57       ` Jens Axboe
@ 2009-12-20 15:00         ` Peter Zijlstra
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 15:00 UTC
  To: Jens Axboe
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, 2009-12-20 at 15:57 +0100, Jens Axboe wrote:
> Exactly, hence the connection. Shall I bisect, or do you already know
> what the problem is?

I have a definite suspect alright, let me prod at this a bit more.
* [PATCH] sched: Fix hotplug
  2009-12-20 14:57       ` Jens Axboe
  2009-12-20 15:00         ` Peter Zijlstra
@ 2009-12-20 16:36         ` Peter Zijlstra
  2009-12-20 22:14           ` Jens Axboe
  2009-12-20 22:33           ` [tip:sched/urgent] sched: Fix hotplug hang tip-bot for Peter Zijlstra
  1 sibling, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 16:36 UTC
  To: Jens Axboe
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

The hot-unplug kstopmachine usage does a wakeup after deactivating the
cpu, hence we cannot use cpu_active() here but must rely on the good
olde online.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 720df10..0ac4fa5 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2348,7 +2348,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
 	 * not worry about this generic constraint ]
 	 */
 	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
-		     !cpu_active(cpu)))
+		     !cpu_online(cpu)))
 		cpu = select_fallback_rq(task_cpu(p), p);
 
 	return cpu;
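[Illustration, not part of the posted patch] The one-line change above can be hard to picture without the two CPU masks side by side. What follows is a minimal userspace sketch, assuming a toy model in which a CPU being unplugged first loses its "active" bit while keeping its "online" bit, and in which a thread pinned to that CPU is then woken. Every toy_* name, the bitmask representation, and the simplified fallback rule are hypothetical and only loosely mirror select_task_rq() and select_fallback_rq(); this is one plausible reading of the commit message, not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy stand-ins for the online and active CPU masks, one bit per CPU. */
unsigned int toy_online_mask = 0xf;
unsigned int toy_active_mask = 0xf;

bool toy_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return (toy_online_mask >> cpu) & 1u;
}

bool toy_cpu_active(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return (toy_active_mask >> cpu) & 1u;
}

/* Assumed first step of hot-unplug: clear "active", leave "online" set. */
void toy_deactivate(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_active_mask &= ~(1u << cpu);
}

/* Simplified fallback: lowest active CPU the task is allowed on,
 * otherwise the lowest active CPU of all. Returns -1 if none is active. */
int toy_fallback(unsigned int allowed)
{
	unsigned int candidates = allowed & toy_active_mask;

	if (!candidates)
		candidates = toy_active_mask;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if ((candidates >> cpu) & 1u)
			return cpu;
	return -1;
}

/* Shape of the check before the patch: an inactive CPU is rejected. */
int toy_select_rq_old(int cpu, unsigned int allowed)
{
	if (!((allowed >> cpu) & 1u) || !toy_cpu_active(cpu))
		cpu = toy_fallback(allowed);
	return cpu;
}

/* Shape of the check after the patch: only an offline CPU is rejected. */
int toy_select_rq_new(int cpu, unsigned int allowed)
{
	if (!((allowed >> cpu) & 1u) || !toy_cpu_online(cpu))
		cpu = toy_fallback(allowed);
	return cpu;
}
```

In this model, after toy_deactivate(3), a wakeup of a thread allowed only on CPU 3 is redirected to CPU 0 by toy_select_rq_old() but stays on CPU 3 with toy_select_rq_new(). If the woken thread is one that has to run on the CPU being unplugged, the old check would send it elsewhere, which would be consistent with the hang reported in this thread.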
* Re: [PATCH] sched: Fix hotplug
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
@ 2009-12-20 22:14           ` Jens Axboe
  2009-12-20 22:33           ` [tip:sched/urgent] sched: Fix hotplug hang tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2009-12-20 22:14 UTC
  To: Peter Zijlstra
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, Dec 20 2009, Peter Zijlstra wrote:
> The hot-unplug kstopmachine usage does a wakeup after deactivating the
> cpu, hence we cannot use cpu_active() here but must rely on the good
> olde online.

Yep, this works for me!

Tested-by: Jens Axboe <jens.axboe@oracle.com>

>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  kernel/sched.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 720df10..0ac4fa5 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2348,7 +2348,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
>  	 * not worry about this generic constraint ]
>  	 */
>  	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
> -		     !cpu_active(cpu)))
> +		     !cpu_online(cpu)))
>  		cpu = select_fallback_rq(task_cpu(p), p);
>
>  	return cpu;
>
>

--
Jens Axboe
* [tip:sched/urgent] sched: Fix hotplug hang
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
  2009-12-20 22:14           ` Jens Axboe
@ 2009-12-20 22:33           ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-12-20 22:33 UTC
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, peterz, sachinp, jens.axboe, benh, heiko.carstens, tglx, mingo

Commit-ID:  70f1120527797adb31c68bdc6f1b45e182c342c7
Gitweb:     http://git.kernel.org/tip/70f1120527797adb31c68bdc6f1b45e182c342c7
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Sun, 20 Dec 2009 17:36:27 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 20 Dec 2009 23:31:23 +0100

sched: Fix hotplug hang

The hot-unplug kstopmachine usage does a wakeup after
deactivating the cpu, hence we cannot use cpu_active()
here but must rely on the good olde online.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <1261326987.4314.24.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 7ffde2a..87f1f47 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2346,7 +2346,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
 	 * not worry about this generic constraint ]
 	 */
 	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
-		     !cpu_active(cpu)))
+		     !cpu_online(cpu)))
 		cpu = select_fallback_rq(task_cpu(p), p);
 
 	return cpu;