* CPU offline/online related hang with latest git
@ 2009-12-20 10:04 Sachin Sant
  2009-12-20 13:31 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Sachin Sant @ 2009-12-20 10:04 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
CPU hotplug on multiple architectures. Trying to offline
a CPU results in a machine hang.

Sample output from a 4-way x86_64 box:

#:/sys/devices/system/cpu/cpu3 # cat online
1
#:/sys/devices/system/cpu/cpu3 # echo 0 > online
Stack:
Call Trace:
 <IRQ>
 <EOI>
Code: 45 f0 48 89 45 b8 48 8d 45 d0 4c 89 4d f8 c7 45 b0 10 00 00 00 48 89 45 c0 e8 5a ff ff ff c9 c3 89 f0 b9 40 00 00 00 55 99 f7 f9 <48> 89 e5 48 89 f9 48 83 ec 08 31 d2 41 89 c0 eb 12 48 8b 01 48
Stack:
Call Trace:
Code: 20 00 00 00 00 48 89 3e 49 8d 74 24 28 e8 45 75 1d 00 49 c7 44 24 50 00 00 00 00 48 8b 9b 60 01 00 00 48 85 db 0f 85 25 ff ff ff <5b> 41 5c c9 c3 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 81 ec
Stack:
Call Trace:
Code: 48 0f 45 d8 eb 1a 44 0f a3 20 19 c0 48 c7 c3 10 b5 a0 81 85 c0 48 c7 c0 30 b5 a0 81 48 0f 44 d8 31 c0 f3 90 44 8b 25 82 35 96 00 <41> 39 c4 74 46 41 83 fc 02 74 08 41 83 fc 03 75 12 eb 03 fa eb
Stack:
Call Trace:
Code: 48 c7 c0 30 b5 a0 81 48 0f 45 d8 eb 1a 44 0f a3 20 19 c0 48 c7 c3 10 b5 a0 81 85 c0 48 c7 c0 30 b5 a0 81 48 0f 44 d8 31 c0 f3 90 <44> 8b 25 82 35 96 00 41 39 c4 74 46 41 83 fc 02 74 08 41 83 fc
Uhhuh. NMI received for unknown reason 25 on CPU 0.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

The above messages are repeated after this. I observed similar hangs
on other architectures as well (s390x, PowerPC, x86_32).

2.6.33-rc1 worked fine. I haven't bisected yet; I will do that
first thing tomorrow morning.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
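
For readers who want to script the reproduction above, the sysfs steps in the report (`cat online`, `echo 0 > online`) can be sketched in C. This is a minimal illustration, not part of the original report: the helper names `cpu_online_path`, `write_cpu_state` and `read_cpu_state` are hypothetical, and the sysfs root is passed in as a parameter so the helpers can be tried against an ordinary file first. Writing to the real control file requires root and, on an affected kernel, is expected to hang the machine.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<sysfs_root>/cpu<N>/online" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
int cpu_online_path(char *buf, size_t len, const char *sysfs_root, int cpu)
{
    int n = snprintf(buf, len, "%s/cpu%d/online", sysfs_root, cpu);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: write "0" (offline) or "1" (online) to the
 * control file, mirroring `echo 0 > online`. Returns 0 on success. */
int write_cpu_state(const char *path, int online)
{
    FILE *f = fopen(path, "w");
    int rc;

    if (!f)
        return -1;
    rc = (fputs(online ? "1\n" : "0\n", f) == EOF) ? -1 : 0;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: read the current state back, mirroring
 * `cat online`. Returns the integer read, or -1 on any error. */
int read_cpu_state(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

With the root set to /sys/devices/system/cpu and CPU 3, `cpu_online_path` yields the same path used in the report; calling `write_cpu_state(path, 0)` on it corresponds to the `echo 0 > online` step.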



* Re: CPU offline/online related hang with latest git
  2009-12-20 10:04 CPU offline/online related hang with latest git Sachin Sant
@ 2009-12-20 13:31 ` Peter Zijlstra
  2009-12-20 14:51   ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 13:31 UTC
  To: Sachin Sant
  Cc: linux-kernel, Ingo Molnar, Heiko Carstens, Benjamin Herrenschmidt

On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> CPU hotplug on multiple architectures. Trying to offline
> a CPU results in a machine hang.

damnit, you're right.. it's getting stuck during hotplug on stop machine
some place.

didn't notice it because it didn't crash.. 

sorry for that :-(



* Re: CPU offline/online related hang with latest git
  2009-12-20 13:31 ` Peter Zijlstra
@ 2009-12-20 14:51   ` Jens Axboe
  2009-12-20 14:54     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2009-12-20 14:51 UTC
  To: Peter Zijlstra
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

On Sun, Dec 20 2009, Peter Zijlstra wrote:
> On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> > With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> > CPU hotplug on multiple architectures. Trying to offline
> > a CPU results in a machine hang.
> 
> damnit, you're right.. it's getting stuck during hotplug on stop machine
> some place.
> 
> didn't notice it because it didn't crash.. 
> 
> sorry for that :-(

Perhaps this is also why suspend doesn't work on current -git... -rc1 is
fine.

-- 
Jens Axboe



* Re: CPU offline/online related hang with latest git
  2009-12-20 14:51   ` Jens Axboe
@ 2009-12-20 14:54     ` Peter Zijlstra
  2009-12-20 14:57       ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 14:54 UTC
  To: Jens Axboe
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

On Sun, 2009-12-20 at 15:51 +0100, Jens Axboe wrote:
> On Sun, Dec 20 2009, Peter Zijlstra wrote:
> > On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> > > With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> > > CPU hotplug on multiple architectures. Trying to offline
> > > a CPU results in a machine hang.
> > 
> > damnit, you're right.. it's getting stuck during hotplug on stop machine
> > some place.
> > 
> > didn't notice it because it didn't crash.. 
> > 
> > sorry for that :-(
> 
> Perhaps this is also why suspend doesn't work on current -git... -rc1 is
> fine.

Yep, suspend relies on hotplug.



* Re: CPU offline/online related hang with latest git
  2009-12-20 14:54     ` Peter Zijlstra
@ 2009-12-20 14:57       ` Jens Axboe
  2009-12-20 15:00         ` Peter Zijlstra
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
  0 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2009-12-20 14:57 UTC
  To: Peter Zijlstra
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

On Sun, Dec 20 2009, Peter Zijlstra wrote:
> On Sun, 2009-12-20 at 15:51 +0100, Jens Axboe wrote:
> > On Sun, Dec 20 2009, Peter Zijlstra wrote:
> > > On Sun, 2009-12-20 at 15:34 +0530, Sachin Sant wrote:
> > > > With 2.6.33-rc1-git1 (dd59f6c..) i am having trouble with
> > > > CPU hotplug on multiple architectures. Trying to offline
> > > > a CPU results in a machine hang.
> > > 
> > > damnit, you're right.. it's getting stuck during hotplug on stop machine
> > > some place.
> > > 
> > > didn't notice it because it didn't crash.. 
> > > 
> > > sorry for that :-(
> > 
> > Perhaps this is also why suspend doesn't work on current -git... -rc1 is
> > fine.
> 
> Yep, suspend relies on hotplug.

Exactly, hence the connection. Shall I bisect, or do you already know
what the problem is?

-- 
Jens Axboe



* Re: CPU offline/online related hang with latest git
  2009-12-20 14:57       ` Jens Axboe
@ 2009-12-20 15:00         ` Peter Zijlstra
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 15:00 UTC
  To: Jens Axboe
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

On Sun, 2009-12-20 at 15:57 +0100, Jens Axboe wrote:

> Exactly, hence the connection. Shall I bisect, or do you already know
> what the problem is?

I have a definite suspect alright, let me prod at this a bit more.



* [PATCH] sched: Fix hotplug
  2009-12-20 14:57       ` Jens Axboe
  2009-12-20 15:00         ` Peter Zijlstra
@ 2009-12-20 16:36         ` Peter Zijlstra
  2009-12-20 22:14           ` Jens Axboe
  2009-12-20 22:33           ` [tip:sched/urgent] sched: Fix hotplug hang tip-bot for Peter Zijlstra
  1 sibling, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2009-12-20 16:36 UTC
  To: Jens Axboe
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

The hot-unplug kstopmachine usage does a wakeup after deactivating the
cpu, hence we cannot use cpu_active() here but must rely on the good
olde online.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 720df10..0ac4fa5 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2348,7 +2348,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
 	 *   not worry about this generic constraint ]
 	 */
 	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
-		     !cpu_active(cpu)))
+		     !cpu_online(cpu)))
 		cpu = select_fallback_rq(task_cpu(p), p);
 
 	return cpu;
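
The effect of the one-line change above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: the `model_*` names, the bitmask representation and the simplified fallback (lowest-numbered active CPU, ignoring affinity) are all hypothetical stand-ins for `cpu_active()`, `cpu_online()`, `select_task_rq()` and `select_fallback_rq()`. It models the situation the changelog describes, a wakeup that happens after the CPU has been deactivated while it is still online.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Toy stand-in for the kernel's CPU masks: one bit per CPU. */
struct cpu_state {
    unsigned int online_mask;  /* CPUs that are still online */
    unsigned int active_mask;  /* CPUs not yet deactivated for unplug */
};

/* Which test the placement check applies (before vs. after the patch). */
enum cpu_check { CHECK_ACTIVE, CHECK_ONLINE };

bool model_cpu_online(const struct cpu_state *s, int cpu)
{
    return (s->online_mask & (1u << cpu)) != 0;
}

bool model_cpu_active(const struct cpu_state *s, int cpu)
{
    return (s->active_mask & (1u << cpu)) != 0;
}

/* Simplified fallback (an assumption of this model): return the
 * lowest-numbered active CPU, or -1 if there is none. */
int model_fallback_rq(const struct cpu_state *s)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (model_cpu_active(s, cpu))
            return cpu;
    return -1;
}

/* Mirrors the shape of the patched condition: keep the chosen CPU only
 * if the task is allowed there and the CPU passes the selected check. */
int model_select_task_rq(const struct cpu_state *s, unsigned int allowed,
                         int cpu, enum cpu_check check)
{
    bool usable = (check == CHECK_ACTIVE) ? model_cpu_active(s, cpu)
                                          : model_cpu_online(s, cpu);

    if (!(allowed & (1u << cpu)) || !usable)
        return model_fallback_rq(s);
    return cpu;
}
```

In this model, take four online CPUs with CPU 3 already removed from the active mask, and a task allowed only on CPU 3. With `CHECK_ACTIVE` the wakeup is redirected to the fallback CPU; with `CHECK_ONLINE` it stays on CPU 3, which is presumably what the wakeup issued during hot-unplug needs.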




* Re: [PATCH] sched: Fix hotplug
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
@ 2009-12-20 22:14           ` Jens Axboe
  2009-12-20 22:33           ` [tip:sched/urgent] sched: Fix hotplug hang tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2009-12-20 22:14 UTC
  To: Peter Zijlstra
  Cc: Sachin Sant, linux-kernel, Ingo Molnar, Heiko Carstens,
	Benjamin Herrenschmidt

On Sun, Dec 20 2009, Peter Zijlstra wrote:
> The hot-unplug kstopmachine usage does a wakeup after deactivating the
> cpu, hence we cannot use cpu_active() here but must rely on the good
> olde online.

Yep, this works for me!

Tested-by: Jens Axboe <jens.axboe@oracle.com>

> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  kernel/sched.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 720df10..0ac4fa5 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2348,7 +2348,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
>  	 *   not worry about this generic constraint ]
>  	 */
>  	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
> -		     !cpu_active(cpu)))
> +		     !cpu_online(cpu)))
>  		cpu = select_fallback_rq(task_cpu(p), p);
>  
>  	return cpu;
> 
> 

-- 
Jens Axboe



* [tip:sched/urgent] sched: Fix hotplug hang
  2009-12-20 16:36         ` [PATCH] sched: Fix hotplug Peter Zijlstra
  2009-12-20 22:14           ` Jens Axboe
@ 2009-12-20 22:33           ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-12-20 22:33 UTC
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, peterz, sachinp,
	jens.axboe, benh, heiko.carstens, tglx, mingo

Commit-ID:  70f1120527797adb31c68bdc6f1b45e182c342c7
Gitweb:     http://git.kernel.org/tip/70f1120527797adb31c68bdc6f1b45e182c342c7
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Sun, 20 Dec 2009 17:36:27 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 20 Dec 2009 23:31:23 +0100

sched: Fix hotplug hang

The hot-unplug kstopmachine usage does a wakeup after
deactivating the cpu, hence we cannot use cpu_active()
here but must rely on the good olde online.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <1261326987.4314.24.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 7ffde2a..87f1f47 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2346,7 +2346,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
 	 *   not worry about this generic constraint ]
 	 */
 	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
-		     !cpu_active(cpu)))
+		     !cpu_online(cpu)))
 		cpu = select_fallback_rq(task_cpu(p), p);
 
 	return cpu;

