linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ARM: smp: return from ipi_cpu_stop()
@ 2013-01-05 13:05 Ulrich Hecht
  2013-01-05 14:09 ` Russell King - ARM Linux
  0 siblings, 1 reply; 5+ messages in thread
From: Ulrich Hecht @ 2013-01-05 13:05 UTC (permalink / raw)
  To: linux-arm-kernel

ipi_cpu_stop() is supposed to mark a CPU offline, but it's the job of the
idle thread to actually shut it down. It seems, however, that the idle
thread doesn't get a chance to run because ipi_cpu_stop() disables all
interrupts and does not return.  This keeps the platform-specific shutdown
code from being executed, which causes big trouble when kexec'ing an SMP
kernel.

This patch fixes the issue by removing the endless loop. Tested on a KZM9G
board.

Signed-off-by: Ulrich Hecht <ulrich.hecht@gmail.com>
---
 arch/arm/kernel/smp.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 84f4cbf..e72b025 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -571,9 +571,6 @@ static void ipi_cpu_stop(unsigned int cpu)
 
 	local_fiq_disable();
 	local_irq_disable();
-
-	while (1)
-		cpu_relax();
 }
 
 /*
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH] ARM: smp: return from ipi_cpu_stop()
  2013-01-05 13:05 [PATCH] ARM: smp: return from ipi_cpu_stop() Ulrich Hecht
@ 2013-01-05 14:09 ` Russell King - ARM Linux
  2013-01-05 14:49   ` Ulrich Hecht
  2013-01-05 15:08   ` Ulrich Hecht
  0 siblings, 2 replies; 5+ messages in thread
From: Russell King - ARM Linux @ 2013-01-05 14:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 05, 2013 at 02:05:23PM +0100, Ulrich Hecht wrote:
> ipi_cpu_stop() is supposed to mark a CPU offline, but it's the job of the
> idle thread to actually shut it down.

Wrong.  Where do you get that idea from?

If you look at what x86 does, you'll notice that it too puts the CPU
into a busy loop.

Plus, there's no guarantee what so ever that anything will happen in the
idle loop; this is not the _normal_ CPU shutdown path, this is the "oh
fsck, we need to stop all other CPUs because we panic'd" handler.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] ARM: smp: return from ipi_cpu_stop()
  2013-01-05 14:09 ` Russell King - ARM Linux
@ 2013-01-05 14:49   ` Ulrich Hecht
  2013-01-05 15:34     ` Russell King - ARM Linux
  2013-01-05 15:08   ` Ulrich Hecht
  1 sibling, 1 reply; 5+ messages in thread
From: Ulrich Hecht @ 2013-01-05 14:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 5, 2013 at 3:09 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Plus, there's no guarantee what so ever that anything will happen in the
> idle loop; this is not the _normal_ CPU shutdown path, this is the "oh
> fsck, we need to stop all other CPUs because we panic'd" handler.

So who's job is it then to call cpu_die()? Because that's what doesn't
happen and leaves the second core spinning, wrecking the kexec'd
kernel.

CU
Uli

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] ARM: smp: return from ipi_cpu_stop()
  2013-01-05 14:09 ` Russell King - ARM Linux
  2013-01-05 14:49   ` Ulrich Hecht
@ 2013-01-05 15:08   ` Ulrich Hecht
  1 sibling, 0 replies; 5+ messages in thread
From: Ulrich Hecht @ 2013-01-05 15:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 5, 2013 at 3:09 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 05, 2013 at 02:05:23PM +0100, Ulrich Hecht wrote:
>> ipi_cpu_stop() is supposed to mark a CPU offline, but it's the job of the
>> idle thread to actually shut it down.
>
> Wrong.  Where do you get that idea from?

I think it's because that's where the only code path ending up at
shmobile_cpu_die() starts:
cpu_idle() -> cpu_die() -> platform_cpu_die() -> smp_ops.cpu_die()
(which is set to shmobile_cpu_die() on my board).

CU
Uli

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] ARM: smp: return from ipi_cpu_stop()
  2013-01-05 14:49   ` Ulrich Hecht
@ 2013-01-05 15:34     ` Russell King - ARM Linux
  0 siblings, 0 replies; 5+ messages in thread
From: Russell King - ARM Linux @ 2013-01-05 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 05, 2013 at 03:49:33PM +0100, Ulrich Hecht wrote:
> On Sat, Jan 5, 2013 at 3:09 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Plus, there's no guarantee what so ever that anything will happen in the
> > idle loop; this is not the _normal_ CPU shutdown path, this is the "oh
> > fsck, we need to stop all other CPUs because we panic'd" handler.
> 
> So who's job is it then to call cpu_die()? Because that's what doesn't
> happen and leaves the second core spinning, wrecking the kexec'd
> kernel.

There's been discussions about how kexec should be handled, and it's
still up in the air - mainly because this area has been left totally
open architecturally about how it should be handled.  Many platforms
don't provide any support what so ever for dealing with the secondary
CPUs once they've been woken up, which makes the problem of what to
do with a running CPU across a kexec event extremely difficult to
deal with.

There's an idea to call disable_nonboot_cpus() from machine_shutdown(),
but this results in that function being called multiple times by some
paths, such as the restart path.

This suffers from the problem that machine_restart() is called from
non-process contexts, which means that sleeping in _cpu_down() (called
via disable_nonboot_cpus()) really isn't allowable.

It's really a bitch of a problem; if we had somewhere to put the CPUs
(eg, back into the boot loader) for every platform, then we wouldn't
have a problem - unfortunately we don't know if we still have a boot
loader around once the kernel is running.

But alas, such foresight is rather lacking.

All the time that we have random different platforms going off and doing
totally random different things, we're going to constantly hit these
kinds of problems.  I don't think it's going to be long before these
cause such a problem that we basically say "no, we're not going to make
it work on platform X because platform X is just being stupid.  Get
with the program, work with the community and design something sensible
instead."

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-01-05 15:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-05 13:05 [PATCH] ARM: smp: return from ipi_cpu_stop() Ulrich Hecht
2013-01-05 14:09 ` Russell King - ARM Linux
2013-01-05 14:49   ` Ulrich Hecht
2013-01-05 15:34     ` Russell King - ARM Linux
2013-01-05 15:08   ` Ulrich Hecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).