linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: santosh.shilimkar@ti.com (Santosh)
To: linux-arm-kernel@lists.infradead.org
Subject: [patch] ARM: smpboot: Enable interrupts after marking CPU online/active
Date: Fri, 09 Sep 2011 09:47:07 +0530	[thread overview]
Message-ID: <4E699343.7030505@ti.com> (raw)
In-Reply-To: <20110908215314.829452535@linutronix.de>

On Friday 09 September 2011 03:27 AM, Thomas Gleixner wrote:
> Frank Rowand reported:
>
>   I have a consistent (every boot) hang on boot with the RT patches.
>   With a few hacks to get console output, I get:
>
>     rcu_preempt_state detected stalls on CPUs/tasks
>
>   I have also replicated the problem on the ARM RealView (in tree) and
>   without the RT patches.
>
>   The problem ended up being caused by the allowed cpus mask being set
>   to all possible cpus for the ksoftirqd on the secondary processors.
>   So the RCU softirq was never executing on the secondary cpu.
>
>   The problem was that ksoftirqd was woken on the secondary processors before
>   the secondary processors were online. This led to allowed cpus being set
>   to all cpus.
>
>      wake_up_process()
>         try_to_wake_up()
>            select_task_rq()
>               if (... || !cpu_online(cpu))
>                  select_fallback_rq(task_cpu(p), p)
>                     ...
>                     /* No more Mr. Nice Guy. */
>                     dest_cpu = cpuset_cpus_allowed_fallback(p)
>                        do_set_cpus_allowed(p, cpu_possible_mask)
>                           #  Thus ksoftirqd can now run on any cpu...
> </report>
>
> The reason is that the ARM SMP boot code for the secondary CPUs enables
> interrupts before the newly brought up CPU is marked online and
> active.
>
> That causes a wakeup of ksoftirqd or a wakeup of any other kernel
> thread which is affine to the brought up CPU break that threads
> affinity and therefor being scheduled on already online CPUs.
>
> This problem has been observed on x86 before and the only solution is
> to mark the CPU online and wait for the CPU active bit before the
> point where interrupts are enabled.
>
> This is safe as the percpu timer setup and the calibration code are
> not part of the critical setup path and the calibration code needs to
> have interrupts enabled anyway. We cannot schedule away at this point
> because we are still in the preempt disabled region which is released
> in cpu_idle().
>
> Reported-and-tested-by: Frank Rowand<frank.rowand@am.sony.com>
> Link:http://lkml.kernel.org/r/alpine.LFD.2.02.1109071115410.2723 at ionos
> Signed-off-by: Thomas Gleixner<tglx@linutronix.de>

A while back, while debugging a CPU ONLINE issue, I cooked up the
similar patch based on the above race condition.

https://lkml.org/lkml/2011/6/20/79

But the issue I was facing was slightly different and that got sorted
out with fixing the re-calibration code.

Good to see that we have a test case which proves the race conditions,
I was describing.

Regards
Santosh

  reply	other threads:[~2011-09-09  4:17 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-08 21:57 [patch] ARM: smpboot: Enable interrupts after marking CPU online/active Thomas Gleixner
2011-09-09  4:17 ` Santosh [this message]
2011-09-13 13:30 ` amit kachhap
2011-09-13 13:32   ` Russell King - ARM Linux
2011-09-13 17:22     ` Vincent Guittot
2011-09-13 17:53       ` Russell King - ARM Linux
2011-09-13 20:48         ` Thomas Gleixner
2011-09-13 22:37           ` Russell King - ARM Linux
2011-09-14  1:10         ` Frank Rowand
2011-09-14  6:55           ` Vincent Guittot
2011-09-23  8:40         ` Russell King - ARM Linux
2011-09-26  7:26           ` Amit Kachhap
2011-09-29  7:40           ` Kukjin Kim
2011-09-29 20:12             ` Thomas Gleixner
2011-09-30  6:42               ` Kukjin Kim
2011-10-07  9:49           ` Kukjin Kim
2011-10-07 12:17             ` Thomas Gleixner
2011-10-07 14:09               ` Amit Kachhap
2011-10-10  4:28               ` Kukjin Kim
2011-10-19 21:16               ` Dima Zavin
2011-10-20  0:32                 ` Dima Zavin
2011-11-15 21:54                   ` Stepan Moskovchenko
2011-11-15 22:00                     ` Thomas Gleixner
2011-12-14  0:13                       ` Dima Zavin
2011-12-14  0:26                         ` Thomas Gleixner
2011-12-15 16:09                           ` Peter Zijlstra
2011-11-15 23:27                     ` Dima Zavin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E699343.7030505@ti.com \
    --to=santosh.shilimkar@ti.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).