linux-arm-kernel.lists.infradead.org archive mirror
* Is Pandaboard cpuhotplug working stably?
@ 2011-12-21  9:23 Barry Song
  2011-12-21  9:46 ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread
From: Barry Song @ 2011-12-21  9:23 UTC (permalink / raw)
  To: linux-arm-kernel

Hi guys,
I tried cpuhotplug on the Pandaboard with both
Pandroid_Froyo_L27.8.2_release_pkg and Linaro 11.11. It does not work
stably on either.
On Pandroid_Froyo_L27.8.2_release_pkg, unplugging cpu1 works well:
# echo 0 > /sys/devices/system/cpu/cpu1/online
CPU1: shutdown

If I enable cpu1 again with "echo 1 >
/sys/devices/system/cpu/cpu1/online", the system ends up in one of
three states at random: hang, normal operation, or panic.

With the Linaro 11.11 release, "echo 0 >
/sys/devices/system/cpu/cpu1/online" makes the system hang, and
pressing the reset key does not reset the board; the only way to
recover is to pull the power.

I am sorry that I don't have more time to debug this and find more
clues. I just want to ask whether there is a version on which
cpuhotplug works normally?

Thanks
barry
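
The offline/online sequence reported in this message can be repeated in a loop to see how often it fails. The following is a minimal sketch, not a script from the thread: the function name hotplug_cycle and the CPU_SYSFS override variable are assumptions made for illustration, and only the sysfs path itself comes from the message above. A hang or panic simply means the loop never prints the next line.

```sh
#!/bin/sh
# Sketch only: hotplug_cycle and CPU_SYSFS are hypothetical names.
# CPU_SYSFS can be pointed at a scratch directory for a dry run.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# hotplug_cycle CPU COUNT: take cpuCPU offline and back online COUNT times.
hotplug_cycle() {
    cpu=$1
    count=$2
    ctl="$CPU_SYSFS/cpu$cpu/online"
    i=1
    while [ "$i" -le "$count" ]; do
        echo 0 > "$ctl" || return 1
        echo 1 > "$ctl" || return 1
        # Read the state back; after a hang or panic this is never reached.
        [ "$(cat "$ctl")" = "1" ] || return 1
        echo "cycle $i: cpu$cpu back online"
        i=$((i + 1))
    done
}
```

On a real board this would be run as root, for example "hotplug_cycle 1 100", with a serial console attached so that the last completed cycle is visible after a failure.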


* Is Pandaboard cpuhotplug working stably?
  2011-12-21  9:23 Is Pandaboard cpuhotplug working stably? Barry Song
@ 2011-12-21  9:46 ` Russell King - ARM Linux
  2011-12-21  9:59   ` Barry Song
  0 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2011-12-21  9:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 21, 2011 at 05:23:48PM +0800, Barry Song wrote:
> Hi guys,
> I tried cpuhotplug on the Pandaboard with both
> Pandroid_Froyo_L27.8.2_release_pkg and Linaro 11.11. It does not work
> stably on either.
> On Pandroid_Froyo_L27.8.2_release_pkg, unplugging cpu1 works well:
> # echo 0 > /sys/devices/system/cpu/cpu1/online
> CPU1: shutdown
> 
> If I enable cpu1 again with "echo 1 >
> /sys/devices/system/cpu/cpu1/online", the system ends up in one of
> three states at random: hang, normal operation, or panic.
> 
> With the Linaro 11.11 release, "echo 0 >
> /sys/devices/system/cpu/cpu1/online" makes the system hang, and
> pressing the reset key does not reset the board; the only way to
> recover is to pull the power.
> 
> I am sorry that I don't have more time to debug this and find more
> clues. I just want to ask whether there is a version on which
> cpuhotplug works normally?

cpu hotplug is basically totally buggered - the preconditions placed
upon the bringup code path are basically impossible to satisfy in any
shape or form at the moment.

There's the requirement that the secondary CPU is marked online and
active before interrupts are enabled for the thread migration stuff
to behave correctly.  However, this is incompatible with smp_call_function()
which will wait for online CPUs to respond to an IPI - which this one
won't because interrupts are disabled.

I think there was some discussion about how to fix this but I don't
recall the details.
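
The ordering conflict described above can be illustrated with a toy state model. This is a sketch only, not kernel code: the function name ipi_wait_blocks and its two flag arguments are invented for illustration, and it models just the single condition stated above, namely a CPU that is already in the online mask while its interrupts are still disabled.

```sh
#!/bin/sh
# Toy model, not kernel code: ipi_wait_blocks is a hypothetical helper.
# Arguments: ONLINE (1 if the CPU is marked online) and IRQS (1 if its
# interrupts are enabled). A caller that waits for an IPI acknowledgement
# from every online CPU makes no progress while an online CPU still has
# interrupts disabled.
ipi_wait_blocks() {
    online=$1
    irqs=$2
    if [ "$online" -eq 1 ] && [ "$irqs" -eq 0 ]; then
        echo "blocks"      # marked online, cannot take the IPI yet
    else
        echo "proceeds"    # either not waited on, or able to respond
    fi
}

# Bring-up order required for thread migration: online first, IRQs later.
ipi_wait_blocks 0 0   # before the CPU is marked online
ipi_wait_blocks 1 0   # marked online, interrupts still disabled
ipi_wait_blocks 1 1   # interrupts enabled
```

The middle call is the window in question: the CPU is visible to anything that iterates over online CPUs, but it cannot yet respond to an interrupt.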


* Is Pandaboard cpuhotplug working stably?
  2011-12-21  9:46 ` Russell King - ARM Linux
@ 2011-12-21  9:59   ` Barry Song
  2011-12-21 10:07     ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread
From: Barry Song @ 2011-12-21  9:59 UTC (permalink / raw)
  To: linux-arm-kernel

2011/12/21 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> On Wed, Dec 21, 2011 at 05:23:48PM +0800, Barry Song wrote:
>> Hi guys,
>> I tried cpuhotplug on the Pandaboard with both
>> Pandroid_Froyo_L27.8.2_release_pkg and Linaro 11.11. It does not work
>> stably on either.
>> On Pandroid_Froyo_L27.8.2_release_pkg, unplugging cpu1 works well:
>> # echo 0 > /sys/devices/system/cpu/cpu1/online
>> CPU1: shutdown
>>
>> If I enable cpu1 again with "echo 1 >
>> /sys/devices/system/cpu/cpu1/online", the system ends up in one of
>> three states at random: hang, normal operation, or panic.
>>
>> With the Linaro 11.11 release, "echo 0 >
>> /sys/devices/system/cpu/cpu1/online" makes the system hang, and
>> pressing the reset key does not reset the board; the only way to
>> recover is to pull the power.
>>
>> I am sorry that I don't have more time to debug this and find more
>> clues. I just want to ask whether there is a version on which
>> cpuhotplug works normally?
>
> cpu hotplug is basically totally buggered - the preconditions placed
> upon the bringup code path are basically impossible to satisfy in any
> shape or form at the moment.
>
> There's the requirement that the secondary CPU is marked online and
> active before interrupts are enabled for the thread migration stuff
> to behave correctly.  However, this is incompatible with smp_call_function()
> which will wait for online CPUs to respond to an IPI - which this one
> won't because interrupts are disabled.
>
> I think there was some discussion about how to fix this but I don't
> recall the details.

Thanks, Russell. So can I take it that this is an ARM-kernel-specific bug
that currently affects all ARM SMP chips?
The bug doesn't happen on x86:
root@ubuntu:~/simple-rootfs/initrd/bin# echo 0 > /sys/devices/system/cpu/cpu3/online
root@ubuntu:~/simple-rootfs/initrd/bin# echo 1 > /sys/devices/system/cpu/cpu3/online
root@ubuntu:~/simple-rootfs/initrd/bin# echo 0 > /sys/devices/system/cpu/cpu2/online
root@ubuntu:~/simple-rootfs/initrd/bin# echo 1 > /sys/devices/system/cpu/cpu2/online

-barry


* Is Pandaboard cpuhotplug working stably?
  2011-12-21  9:59   ` Barry Song
@ 2011-12-21 10:07     ` Russell King - ARM Linux
  2011-12-22  8:49       ` Shilimkar, Santosh
  0 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2011-12-21 10:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 21, 2011 at 05:59:07PM +0800, Barry Song wrote:
> 2011/12/21 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> > cpu hotplug is basically totally buggered - the preconditions placed
> > upon the bringup code path are basically impossible to satisfy in any
> > shape or form at the moment.
> >
> > There's the requirement that the secondary CPU is marked online and
> > active before interrupts are enabled for the thread migration stuff
> > to behave correctly.  However, this is incompatible with smp_call_function()
> > which will wait for online CPUs to respond to an IPI - which this one
> > won't because interrupts are disabled.
> >
> > I think there was some discussion about how to fix this but I don't
> > recall the details.
> 
> Thanks, Russell. So can I take it that this is an ARM-kernel-specific bug
> that currently affects all ARM SMP chips?
> The bug doesn't happen on x86:

I don't think so.  There's nothing ARM specific about it.


* Is Pandaboard cpuhotplug working stably?
  2011-12-21 10:07     ` Russell King - ARM Linux
@ 2011-12-22  8:49       ` Shilimkar, Santosh
  2011-12-22 10:24         ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread
From: Shilimkar, Santosh @ 2011-12-22  8:49 UTC (permalink / raw)
  To: linux-arm-kernel

+ Peter Z

On Wed, Dec 21, 2011 at 3:37 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Dec 21, 2011 at 05:59:07PM +0800, Barry Song wrote:
>> 2011/12/21 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>> > cpu hotplug is basically totally buggered - the preconditions placed
>> > upon the bringup code path are basically impossible to satisfy in any
>> > shape or form at the moment.
>> >
>> > There's the requirement that the secondary CPU is marked online and
>> > active before interrupts are enabled for the thread migration stuff
>> > to behave correctly.  However, this is incompatible with smp_call_function()
>> > which will wait for online CPUs to respond to an IPI - which this one
>> > won't because interrupts are disabled.
>> >
>> > I think there was some discussion about how to fix this but I don't
>> > recall the details.
>>
>> Thanks, Russell. So can I take it that this is an ARM-kernel-specific bug
>> that currently affects all ARM SMP chips?
>> The bug doesn't happen on x86:
>
> I don't think so.  There's nothing ARM specific about it.

There are a few patches floating around for this issue. I posted one version
long back [1], and then there was one more from Thomas G.
The most recent one is from Peter Z [2], which moves the
fix for the CPU online race into core code.

Can you try Peter's patch with your test case?

Regards,
Santosh

[1] https://lkml.org/lkml/2011/6/20/79
[2] https://lkml.org/lkml/2011/12/15/255


* Is Pandaboard cpuhotplug working stably?
  2011-12-22  8:49       ` Shilimkar, Santosh
@ 2011-12-22 10:24         ` Russell King - ARM Linux
  2011-12-22 10:27           ` Shilimkar, Santosh
  2011-12-27  4:49           ` Varun Wadekar
  0 siblings, 2 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2011-12-22 10:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 22, 2011 at 02:19:23PM +0530, Shilimkar, Santosh wrote:
> + Peter Z
> 
> On Wed, Dec 21, 2011 at 3:37 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Wed, Dec 21, 2011 at 05:59:07PM +0800, Barry Song wrote:
> >> 2011/12/21 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> >> > cpu hotplug is basically totally buggered - the preconditions placed
> >> > upon the bringup code path are basically impossible to satisfy in any
> >> > shape or form at the moment.
> >> >
> >> > There's the requirement that the secondary CPU is marked online and
> >> > active before interrupts are enabled for the thread migration stuff
> >> > to behave correctly.  However, this is incompatible with smp_call_function()
> >> > which will wait for online CPUs to respond to an IPI - which this one
> >> > won't because interrupts are disabled.
> >> >
> >> > I think there was some discussion about how to fix this but I don't
> >> > recall the details.
> >>
> >> Thanks, Russell. So can I take it that this is an ARM-kernel-specific bug
> >> that currently affects all ARM SMP chips?
> >> The bug doesn't happen on x86:
> >
> > > I don't think so.  There's nothing ARM specific about it.
> 
> There are a few patches floating around for this issue. I posted one version
> long back [1], and then there was one more from Thomas G.
> The most recent one is from Peter Z [2], which moves the
> fix for the CPU online race into core code.
> 
> Can you try Peter's patch with your test case?
> 
> Regards,
> Santosh
> 
> [1] https://lkml.org/lkml/2011/6/20/79
> [2] https://lkml.org/lkml/2011/12/15/255

[1] is already fixed - and is not the latest "problem" with this code.
Fixing the problem in [1] actually itself created the latest problem
with smp_call_function() which wasn't there before this change.  Patch
[2] refers to this problem and proposes a fix for it.


* Is Pandaboard cpuhotplug working stably?
  2011-12-22 10:24         ` Russell King - ARM Linux
@ 2011-12-22 10:27           ` Shilimkar, Santosh
  2011-12-27  4:49           ` Varun Wadekar
  1 sibling, 0 replies; 9+ messages in thread
From: Shilimkar, Santosh @ 2011-12-22 10:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 22, 2011 at 3:54 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Dec 22, 2011 at 02:19:23PM +0530, Shilimkar, Santosh wrote:
>> + Peter Z
>>
>> On Wed, Dec 21, 2011 at 3:37 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Wed, Dec 21, 2011 at 05:59:07PM +0800, Barry Song wrote:
>> >> 2011/12/21 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>> >> > cpu hotplug is basically totally buggered - the preconditions placed
>> >> > upon the bringup code path are basically impossible to satisfy in any
>> >> > shape or form at the moment.
>> >> >
>> >> > There's the requirement that the secondary CPU is marked online and
>> >> > active before interrupts are enabled for the thread migration stuff
>> >> > to behave correctly.  However, this is incompatible with smp_call_function()
>> >> > which will wait for online CPUs to respond to an IPI - which this one
>> >> > won't because interrupts are disabled.
>> >> >
>> >> > I think there was some discussion about how to fix this but I don't
>> >> > recall the details.
>> >>
>> >> Thanks, Russell. So can I take it that this is an ARM-kernel-specific bug
>> >> that currently affects all ARM SMP chips?
>> >> The bug doesn't happen on x86:
>> >
>> > I don't think so.  There's nothing ARM specific about it.
>>
>> There are a few patches floating around for this issue. I posted one version
>> long back [1], and then there was one more from Thomas G.
>> The most recent one is from Peter Z [2], which moves the
>> fix for the CPU online race into core code.
>>
>> Can you try Peter's patch with your test case?
>>
>> Regards,
>> Santosh
>>
>> [1] https://lkml.org/lkml/2011/6/20/79
>> [2] https://lkml.org/lkml/2011/12/15/255
>
> [1] is already fixed - and is not the latest "problem" with this code.
> Fixing the problem in [1] actually itself created the latest problem
> with smp_call_function() which wasn't there before this change.  Patch
> [2] refers to this problem and proposes a fix for it.

Thanks for the information, Russell. Looks like I missed the thread in between.

Regards
Santosh


* Is Pandaboard cpuhotplug working stably?
  2011-12-22 10:24         ` Russell King - ARM Linux
  2011-12-22 10:27           ` Shilimkar, Santosh
@ 2011-12-27  4:49           ` Varun Wadekar
  2012-01-03 17:21             ` Russell King - ARM Linux
  1 sibling, 1 reply; 9+ messages in thread
From: Varun Wadekar @ 2011-12-27  4:49 UTC (permalink / raw)
  To: linux-arm-kernel


>> There are a few patches floating around for this issue. I posted one version
>> long back [1], and then there was one more from Thomas G.
>> The most recent one is from Peter Z [2], which moves the
>> fix for the CPU online race into core code.
>>
>> Can you try Peter's patch with your test case?
>>
>> Regards,
>> Santosh
>>
>> [1] https://lkml.org/lkml/2011/6/20/79
>> [2] https://lkml.org/lkml/2011/12/15/255
> [1] is already fixed - and is not the latest "problem" with this code.
> Fixing the problem in [1] actually itself created the latest problem
> with smp_call_function() which wasn't there before this change.  Patch
> [2] refers to this problem and proposes a fix for it.
>
>

Any idea whether this patch is good to go into mainline? I am facing the same
issue on Tegra with 3.1.5, and this patch fixes the issue caused by [1].
Ideally, if this change is OK, it should be merged into 3.1.6 too.
Since this change seems to fix the issue at hand, I intend to merge it
directly into our internal repo and wait for it to become part of 3.1.6.

Thanks.


* Is Pandaboard cpuhotplug working stably?
  2011-12-27  4:49           ` Varun Wadekar
@ 2012-01-03 17:21             ` Russell King - ARM Linux
  0 siblings, 0 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2012-01-03 17:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 27, 2011 at 10:19:32AM +0530, Varun Wadekar wrote:
> >>
> >> [1] https://lkml.org/lkml/2011/6/20/79
> >> [2] https://lkml.org/lkml/2011/12/15/255
> > [1] is already fixed - and is not the latest "problem" with this code.
> > Fixing the problem in [1] actually itself created the latest problem
> > with smp_call_function() which wasn't there before this change.  Patch
> > [2] refers to this problem and proposes a fix for it.
> 
> Any idea whether this patch is good to go into mainline? I am facing the same
> issue on Tegra with 3.1.5, and this patch fixes the issue caused by [1].

Either we deem the fix for [1] to have caused a worse problem than it
fixed and we back it out, or the patch in [2] gets merged.

Which happens depends on the scheduler people; it can only be solved by
solving the nigh-on impossible-to-satisfy demands of the scheduler code
upon the CPU hotplug code.

I would suggest that [2] is a rather important patch which needs to be
merged before -final.

