From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Michael Ellerman <michael@ellerman.id.au>,
	Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: "Shreyas B. Prabhu" <shreyas@linux.vnet.ibm.com>,
	Paul Mackerras <paulus@samba.org>,
	Anton Blanchard <anton@samba.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: offlining cpus breakage
Date: Sat, 17 Jan 2015 19:09:23 +0530
Message-ID: <54BA660B.7050609@linux.vnet.ibm.com>
In-Reply-To: <1421377482.18166.5.camel@ellerman.id.au>

On 01/16/2015 08:34 AM, Michael Ellerman wrote:
> On Fri, 2015-01-16 at 13:28 +1300, Alexey Kardashevskiy wrote:
>> On 01/16/2015 02:22 AM, Preeti U Murthy wrote:
>>> Hi Alexey,
>>>
>>> Can you let me know if the following patch fixes the issue for you ?
>>> It did for us on one of our machines that we were investigating on.
>>
>> This fixes the issue for me as well, thanks!
>>
>> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>	
> 
> OK, that's great.
> 
> But, I really don't think we can ask upstream to merge this patch to generic
> code when we don't have a good explanation for why it's necessary. At least I'm
> not going to ask anyone to do that :)
> 
> So Preeti can you either write a 100% convincing explanation of why this patch
> is correct in the general case, or (preferably) do some more investigating to
> work out what Alexey's bug actually is.

On further investigation, I found that the issue lies in the latency of
the cpu hotplug operation, specifically the time taken for the offlined
cpu to enter the powersave state.

I added trace prints throughout the cpu hotplug code to measure the
individual stages. The time between the beginning of the cpu hotplug
operation and the beginning of the __cpu_die() operation (which is one
of the last stages of cpu hotplug) is at most around 40ms. Although this
is not what causes the softlockups, it is quite a large duration.
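
For reference, the trace prints were along these lines (a rough sketch,
not the actual debug patch; get_tb() is the powerpc timebase accessor
and the exact print points shown are illustrative):

	/* Debug only: stamp the interesting points in _cpu_down() */
	u64 t_start, t_die;

	t_start = get_tb();

	/* ... cpu down notifiers, __stop_machine(take_cpu_down) ... */

	t_die = get_tb();
	trace_printk("cpu%u: start -> __cpu_die(): %llu tb ticks\n",
		     cpu, t_die - t_start);

	__cpu_die(cpu);			/* waits for the dying cpu */

	trace_printk("cpu%u: __cpu_die() wait: %llu tb ticks\n",
		     cpu, get_tb() - t_die);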

The more serious issue is the time taken for the __cpu_die() operation
to complete. __cpu_die() waits for the offlined cpu to set its state to
CPU_DEAD, which the dying cpu does just before entering the powersave
state. This wait varies from 4s up to a maximum of 200s! It is not
always this bad, but it does happen quite a few times, and it is during
these runs that we observe the softlockups.
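
For context, the ordering in the generic code looks roughly like this
(paraphrased and simplified from kernel/cpu.c of this era; error
handling elided):

	/* _cpu_down(), simplified */
	err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));

	/* This actually kills the CPU: on powerpc it polls until the
	 * dying cpu has set CPU_DEAD and headed into powersave. */
	__cpu_die(cpu);

	/* Only now do the CPU_DEAD notifiers get to run. */
	cpu_notify_nofail(CPU_DEAD | mod, hcpu);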

This delay is what causes the softlockup, and here is why. If the cpu
going offline is the one broadcasting wakeups to cpus in fastsleep, the
broadcast timer is queued on another cpu only during the CPU_DEAD phase.
The CPU_DEAD notifiers run only after the __cpu_die() operation
completes, which, as the numbers above show, can take a horridly long
time. So between the time irqs are migrated off the about-to-go-offline
cpu and the CPU_DEAD stage, no cpu in fastsleep can be woken up. Hence,
when those cpus finally do get woken up, the unnaturally long idle time
is detected and the softlockup triggers.
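
If I am reading the timekeeping code right, the hand-off happens from
the hrtimer CPU_DEAD notifier, roughly as below (a hedged paraphrase of
the 3.x code, details elided), which is why it is stuck behind the
__cpu_die() wait:

	/* hrtimer_cpu_notify(), simplified */
	case CPU_DEAD:
	case CPU_DEAD_FROZEN:
		/* tears down broadcast state for the dead cpu ... */
		clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DEAD, &scpu);
		/* ... and moves its hrtimers (including the broadcast
		 * timer) over to an online cpu */
		migrate_hrtimers(scpu);
		break;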

The patch I proposed earlier in this thread covered up the problem by
allowing the remaining cpus to freshly reevaluate their wakeups after
the stop machine phase, without having to depend on the previous
broadcast state. So it did not matter what the previously appointed
broadcast cpu was up to. However, there are still corner cases that this
patch cannot solve, understandably, because it does not address the core
issue, which is how to get around the latency of cpu hotplug.

There may be ways to migrate the broadcast timer in time during hotplug
to get around the softlockups, but the latency of the cpu hotplug
operation itself looks like a serious issue. Has anybody observed or
explicitly instrumented a cpu hotplug operation before, and happened to
notice the large time it takes to complete?

Ccing Paul.

Thanks

Regards
Preeti U Murthy
> 
> cheers
