linux-arm-kernel.lists.infradead.org archive mirror
From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC] Make SMP secondary CPU up more resilient to failure.
Date: Thu, 16 Dec 2010 11:34:07 +0000
Message-ID: <20101216113407.GO9937@n2100.arm.linux.org.uk>
In-Reply-To: <AANLkTinM524ozYQhHSpCN49LvZLePO6faHJrqxKBtyJe@mail.gmail.com>

On Wed, Dec 15, 2010 at 05:45:13PM -0600, Andrei Warkentin wrote:
> This is my first time on linux-arm-kernel, and while I've read the
> FAQ, hopefully I don't screw up too badly :).
> 
> Anyway, we're on a dual-core ARMv7 running 2.6.36, and during
> stability stress testing saw the following:
> 1) After a number of hotplug iterations, CPU1 fails to set its online bit
> quickly enough and __cpu_up() times out.
> 2) CPU1 eventually completes its startup and sets the bit, however,
> since _cpu_up() failed, CPU1's active bit is never set.

Why is your CPU taking so long to come up?  We wait one second in the
generic code, counted from the point where the platform code is satisfied
that it has successfully started the CPU.  Normally, platforms wait an
additional second of their own to detect the CPU entering the kernel.
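
To make the two waits concrete, here is a rough userspace sketch of how
they stack up.  It is not the kernel code: the function name, the result
codes and the simulated timings are all made up for illustration, and the
only figures taken from the text above are the two one-second budgets.

```c
#include <assert.h>

/* Hypothetical result codes for this sketch; not kernel error values. */
enum bringup_result {
	BRINGUP_OK,               /* CPU set its online bit in time */
	BRINGUP_PLATFORM_TIMEOUT, /* CPU was never seen entering the kernel */
	BRINGUP_ONLINE_TIMEOUT    /* CPU entered the kernel, went online too late */
};

#define PLATFORM_WAIT_MS 1000 /* platform code: wait for kernel entry */
#define GENERIC_WAIT_MS  1000 /* generic code: wait for the online bit */

/*
 * enter_ms:  simulated delay from the start request until the secondary
 *            CPU is seen entering the kernel.
 * online_ms: simulated delay from that point until it sets its online
 *            bit.  The second budget only starts once the first wait
 *            has succeeded.
 */
enum bringup_result simulate_bringup(int enter_ms, int online_ms)
{
	assert(enter_ms >= 0 && online_ms >= 0);

	if (enter_ms > PLATFORM_WAIT_MS)
		return BRINGUP_PLATFORM_TIMEOUT;
	if (online_ms > GENERIC_WAIT_MS)
		return BRINGUP_ONLINE_TIMEOUT;
	return BRINGUP_OK;
}
```

With these assumptions, a CPU that enters the kernel promptly but needs
1200ms to set its online bit falls into the BRINGUP_ONLINE_TIMEOUT case,
which is the situation described in the report.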

> 2) Additionally I ensure that if the CPU comes up later than it was
> supposed to (shouldn't, but...), then it will not start initializing
> behind cpu_up's back (which is not really undoable). This solves the
> problem with both cpu_up+secondary_start_kernel races and with
> platform_cpu_kill+secondary_start_kernel races.

Why would you have platform_cpu_kill() running at the same time?  Firstly,
hotplug events are serialized, and secondly the platform_cpu_kill() path
should wait up to five seconds for the CPU to go offline.  If it doesn't
go offline within five seconds, it's dead (and maybe we should mark it
not present).
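
As a rough illustration of that five-second rule, the sketch below polls
a simulated CPU and gives up once the budget is spent.  It is userspace
C, not kernel code: struct sim_cpu, wait_for_offline() and the polling
loop are hypothetical, and clearing "present" on timeout is only the
"maybe" suggested above, not existing behaviour.

```c
#include <assert.h>
#include <stdbool.h>

#define OFFLINE_WAIT_MS 5000 /* budget for the CPU to go offline */

/* Hypothetical model of one CPU for this sketch. */
struct sim_cpu {
	bool present;       /* CPU is considered usable at all */
	bool online;        /* CPU is currently running */
	int  dies_after_ms; /* simulated time it takes to go offline */
};

/*
 * Poll every poll_step_ms until the CPU has gone offline or the budget
 * is exhausted.  Returns true if it went offline in time.  On timeout
 * the CPU is treated as dead and marked not present.
 */
bool wait_for_offline(struct sim_cpu *cpu, int poll_step_ms)
{
	assert(poll_step_ms > 0);

	for (int waited = 0; waited <= OFFLINE_WAIT_MS; waited += poll_step_ms) {
		if (waited >= cpu->dies_after_ms) {
			cpu->online = false;
			return true;
		}
	}

	cpu->present = false; /* assumption: follows the "maybe" above */
	return false;
}
```

Because hotplug events are serialized, only one such wait would be in
flight at a time, so the sketch does not need any locking.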


Thread overview: 22+ messages
2010-12-15 23:45 [RFC] Make SMP secondary CPU up more resilient to failure Andrei Warkentin
2010-12-16 11:34 ` Russell King - ARM Linux [this message]
2010-12-16 23:09   ` Andrei Warkentin
2010-12-16 23:28     ` Russell King - ARM Linux
2010-12-17 20:52       ` Andrei Warkentin
2010-12-17 23:14         ` Russell King - ARM Linux
2010-12-17 23:45           ` Andrei Warkentin
2010-12-18  0:08             ` Russell King - ARM Linux
2010-12-18  0:36               ` Russell King - ARM Linux
2010-12-18  7:17               ` Andrei Warkentin
2010-12-18 12:01                 ` Russell King - ARM Linux
2010-12-18 12:10                   ` Andrei Warkentin
2010-12-18 20:04                     ` Russell King - ARM Linux
2010-12-21 21:53                       ` Andrei Warkentin
2010-12-24 17:38                         ` Russell King - ARM Linux
2011-01-13 10:19                           ` Andrei Warkentin
2011-01-13 11:14                             ` Russell King - ARM Linux
2011-01-13 22:03                               ` Andrei Warkentin
2010-12-17  0:11     ` murali at embeddedwireless.com
2010-12-18  9:58     ` Russell King - ARM Linux
2010-12-18 11:54       ` Andrei Warkentin
2010-12-18 12:19         ` Russell King - ARM Linux
