From: santosh.shilimkar@ti.com (Santosh)
To: linux-arm-kernel@lists.infradead.org
Subject: [patch] ARM: smpboot: Enable interrupts after marking CPU online/active
Date: Fri, 09 Sep 2011 09:47:07 +0530
Message-ID: <4E699343.7030505@ti.com>
In-Reply-To: <20110908215314.829452535@linutronix.de>

On Friday 09 September 2011 03:27 AM, Thomas Gleixner wrote:
> Frank Rowand reported:
>
>   I have a consistent (every boot) hang on boot with the RT patches.
>   With a few hacks to get console output, I get:
>
>     rcu_preempt_state detected stalls on CPUs/tasks
>
>   I have also replicated the problem on the ARM RealView (in tree) and
>   without the RT patches.
>
>   The problem ended up being caused by the allowed cpus mask being set
>   to all possible cpus for the ksoftirqd on the secondary processors.
>   So the RCU softirq was never executing on the secondary cpu.
>
>   The problem was that ksoftirqd was woken on the secondary processors before
>   the secondary processors were online. This led to allowed cpus being set
>   to all cpus.
>
>      wake_up_process()
>         try_to_wake_up()
>            select_task_rq()
>               if (... || !cpu_online(cpu))
>                  select_fallback_rq(task_cpu(p), p)
>                     ...
>                     /* No more Mr. Nice Guy. */
>                     dest_cpu = cpuset_cpus_allowed_fallback(p)
>                        do_set_cpus_allowed(p, cpu_possible_mask)
>                           #  Thus ksoftirqd can now run on any cpu...
> </report>
>
> The reason is that the ARM SMP boot code for the secondary CPUs enables
> interrupts before the newly brought up CPU is marked online and
> active.
>
> That causes a wakeup of ksoftirqd, or of any other kernel thread which
> is affine to the CPU being brought up, to break that thread's affinity,
> so the thread ends up being scheduled on CPUs that are already online.
>
> This problem has been observed on x86 before and the only solution is
> to mark the CPU online and wait for the CPU active bit before the
> point where interrupts are enabled.
>
> This is safe as the percpu timer setup and the calibration code are
> not part of the critical setup path and the calibration code needs to
> have interrupts enabled anyway. We cannot schedule away at this point
> because we are still in the preempt disabled region which is released
> in cpu_idle().
>
> Reported-and-tested-by: Frank Rowand <frank.rowand@am.sony.com>
> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109071115410.2723@ionos
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
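
The wakeup path quoted above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: every toy_* name
below is hypothetical and invented for illustration, the masks are plain
bitmasks rather than struct cpumask, and the logic is a simplification of
the fallback described in the report, not the kernel's actual scheduler
code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for a kernel thread such as ksoftirqd/1. */
struct toy_task {
	unsigned int cpus_allowed;	/* bitmask of CPUs the task may run on */
	int cpu;			/* CPU the task is currently assigned to */
};

/* Only the boot CPU (CPU 0) is online at first. */
static unsigned int toy_online_mask = 1u << 0;
static const unsigned int toy_possible_mask = (1u << NR_CPUS) - 1;

static bool toy_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (toy_online_mask & (1u << cpu)) != 0;
}

static void toy_set_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	toy_online_mask |= 1u << cpu;
}

/*
 * Simplified model of the fallback in the quoted trace: prefer a CPU
 * that is both allowed and online; if there is none, widen the task's
 * affinity to all possible CPUs ("No more Mr. Nice Guy") and take the
 * first online CPU.
 */
static int toy_select_fallback_rq(struct toy_task *p)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if ((p->cpus_allowed & (1u << cpu)) && toy_cpu_online(cpu))
			return cpu;

	p->cpus_allowed = toy_possible_mask;	/* affinity is lost here */

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (toy_cpu_online(cpu))
			return cpu;

	return 0;
}

/* Simplified model of the wakeup: fall back if the target CPU is offline. */
static int toy_wake_up(struct toy_task *p)
{
	int cpu = p->cpu;

	if (!(p->cpus_allowed & (1u << cpu)) || !toy_cpu_online(cpu))
		cpu = toy_select_fallback_rq(p);

	p->cpu = cpu;
	return cpu;
}
```

In this model, waking a task that is bound to CPU 1 while only CPU 0 is
online moves it to CPU 0 and widens its mask to all possible CPUs. Calling
toy_set_cpu_online(1) before the wakeup keeps the task on CPU 1 with its
mask intact, which is the ordering the patch enforces by marking the CPU
online before interrupts are enabled.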

A while back, while debugging a CPU ONLINE issue, I cooked up a
similar patch based on the race condition described above.

https://lkml.org/lkml/2011/6/20/79

But the issue I was facing was slightly different, and it got sorted
out by fixing the re-calibration code.

Good to see that we have a test case which proves the race condition
I was describing.

Regards
Santosh

