linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: stepanm@codeaurora.org (Stepan Moskovchenko)
To: linux-arm-kernel@lists.infradead.org
Subject: [patch] ARM: smpboot: Enable interrupts after marking CPU online/active
Date: Tue, 15 Nov 2011 13:54:27 -0800	[thread overview]
Message-ID: <4EC2DF93.2050904@codeaurora.org> (raw)
In-Reply-To: <CAPz4a6C7md+k8pCz6mJtB-v_1h-faGxQZkOqU9vHUtcG-jrYVQ@mail.gmail.com>

On 11:59 AM, Dima Zavin wrote:
> <go go gadget message retractor>
>
> Looks like the issue ended up being the fact that the caller was
> accessing the cpu_online_mask without calling get_online_cpus anywhere
> in the path and could have picked up the the intermediate state where
> CPU was online but not yet active..
>
> --Dima
>
> On Wed, Oct 19, 2011 at 2:16 PM, Dima Zavin<dima@android.com>  wrote:
>> Thomas,
>>
>> I was seeing something similar, and this patch did address part of it,
>> but I still think we have a problem. What I am seeing is the
>> following:
>>
>> We have a SCHED_FIFO kernel thread which tries to do a blocking IPI
>> (smp_call_function_single) to all online cpus (for_each_online_cpu).
>> This introduces a deadlock where the FIFO thread could be preempting
>> the suspend thread which is the one that's trying to set the second
>> CPU active after it sees it going online. The second CPU marked itself
>> online and is then waiting for the first CPU to mark it active before
>> enabling irqs to service the IPI.
>>
>> It feels to me like we should probably not be sending blocking IPIs
>> from a FIFO thread for sanity reasons, but that does mean there's
>> still a deadlock present here.
>>
>> Thoughts?
>>
>> Thanks in advance.
>>
>> --Dima
>>

Hello

I am seeing a deadlock when executing hotplug operations with this patch
applied. When the secondary CPU gets brought up in _cpu_up, the cpu is 
turned on
and then the online notifier gets called, which is what marks the 
secondary CPU
as active. If _cpu_up on the primary CPU is preempted before the 
secondary CPU
is marked active, it is possible that the primary CPU will want to call
smp_call_function (or send an IPI) to the secondary CPU because it is marked
online. However, with this patch, the secondary CPU is still spinning on
!cpu_active(cpu)
with interrupts disabled. So, the primary CPU is now stuck in 
csd_lock_wait(),
waiting for the secondary CPU to respond, while the secondary CPU spins with
interrupts disabled, waiting for the primary CPU to mark it as active. 
So, while
your approach to not call smp_function_single may work for you in your 
specific
case, I believe there is still a problem in the general case.

One suggestion for resolving this might be making smp_call_function look 
at the
active CPUs rather than online CPUs, or to just let the secondary CPU mark
itself as active rather than having the primary CPU do this, though this 
might
defeat the original intended purpose of the active mask.

Steve

  reply	other threads:[~2011-11-15 21:54 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-08 21:57 [patch] ARM: smpboot: Enable interrupts after marking CPU online/active Thomas Gleixner
2011-09-09  4:17 ` Santosh
2011-09-13 13:30 ` amit kachhap
2011-09-13 13:32   ` Russell King - ARM Linux
2011-09-13 17:22     ` Vincent Guittot
2011-09-13 17:53       ` Russell King - ARM Linux
2011-09-13 20:48         ` Thomas Gleixner
2011-09-13 22:37           ` Russell King - ARM Linux
2011-09-14  1:10         ` Frank Rowand
2011-09-14  6:55           ` Vincent Guittot
2011-09-23  8:40         ` Russell King - ARM Linux
2011-09-26  7:26           ` Amit Kachhap
2011-09-29  7:40           ` Kukjin Kim
2011-09-29 20:12             ` Thomas Gleixner
2011-09-30  6:42               ` Kukjin Kim
2011-10-07  9:49           ` Kukjin Kim
2011-10-07 12:17             ` Thomas Gleixner
2011-10-07 14:09               ` Amit Kachhap
2011-10-10  4:28               ` Kukjin Kim
2011-10-19 21:16               ` Dima Zavin
2011-10-20  0:32                 ` Dima Zavin
2011-11-15 21:54                   ` Stepan Moskovchenko [this message]
2011-11-15 22:00                     ` Thomas Gleixner
2011-12-14  0:13                       ` Dima Zavin
2011-12-14  0:26                         ` Thomas Gleixner
2011-12-15 16:09                           ` Peter Zijlstra
2011-11-15 23:27                     ` Dima Zavin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EC2DF93.2050904@codeaurora.org \
    --to=stepanm@codeaurora.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).