From: Daniel J Blueman <daniel@numascale.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Hillf Danton <dhillf@gmail.com>, Borislav Petkov <bp@amd64.org>,
	Ingo Molnar <mingo@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Steffen Persvold <sp@numascale.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [3.14] core onlining/hotplug regression
Date: Fri, 25 Jul 2014 17:36:51 +0800
Message-ID: <53D22533.9030401@numascale.com>
In-Reply-To: <alpine.DEB.2.10.1407251059130.23352@nanos>

On 07/25/2014 05:05 PM, Thomas Gleixner wrote:
> On Fri, 25 Jul 2014, Daniel J Blueman wrote:
>> On a larger x86 system with 1728 cores, 3.15(.6) asserts on
>> smpboot_thread_fn's td->cpu != smp_processor_id() consistently after ~1500
>> cores are online.
>>
>> Reverting the only directly related changes I could find [1,2] doesn't help.
>> Debugging indicates there is a race where the created thread is quickly
>> migrated to core 0 when this occurs, since smp_processor_id returns 0 in these
>> cases. Thomas introduced a thread parked state to fix related issues a year
>> back. Linux 3.14(.13) boots just nice.
>
> Weird. Commits [1,2] are definitely not the culprits.
>
>> Full boot output is at:
>> https://resources.numascale.com/linux-315-thread-mig.txt
>
> Not really helpful, as we don't see what causes it. We just see the
> wreckage.
>
>> Any theories so far? I'll start bisecting when I have full access to the
>> system again in a week and I'll do some more debugging with intermittent
>> access before then.
>
> One thing you could try is enabling tracing.
>
>      "ftrace=function ftrace_dump_on_oops"
>
> It'll take a looooong time to spill out the traces, but that should
> give us the root cause precisely.

Good trick. I'll capture a trace early next week and we'll see what's up.
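
For anyone following along, here is a rough userspace analogue of the
check that fires, i.e. "the CPU this thread was created for" versus "the
CPU it is actually running on". It is only a sketch and not the kernel
code: first_allowed_cpu() and running_on_expected_cpu() are hypothetical
helper names, and it relies on the glibc wrappers sched_setaffinity(2)
and sched_getcpu(3) rather than smp_processor_id().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Return the lowest-numbered CPU the calling thread may run on, or -1 on
 * error. Note: a fixed cpu_set_t only covers CPU_SETSIZE (1024) CPUs; on a
 * larger machine the call can fail and the CPU_ALLOC() family is needed.
 */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;

    return -1;
}

/*
 * Pin the calling thread to expected_cpu, then compare against the CPU we
 * are really executing on. Returns 1 on a match, 0 on a mismatch (the
 * condition the kernel asserts on), and -1 if pinning failed.
 */
int running_on_expected_cpu(int expected_cpu)
{
    cpu_set_t set;

    assert(expected_cpu >= 0 && expected_cpu < CPU_SETSIZE);

    CPU_ZERO(&set);
    CPU_SET(expected_cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;

    /* Userspace stand-in for td->cpu != smp_processor_id(). */
    return sched_getcpu() == expected_cpu;
}
```

Calling running_on_expected_cpu(first_allowed_cpu()) should report a
match as long as the affinity mask is left alone afterwards. The failure
above corresponds to the mismatch case, where the thread ends up on core
0 instead of the core it was created for.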

Thanks,
   Daniel
-- 
Daniel J Blueman
Principal Software Engineer, Numascale

Thread overview: 4+ messages
2014-07-25  7:50 [3.14] core onlining/hotplug regression Daniel J Blueman
2014-07-25  9:05 ` Thomas Gleixner
2014-07-25  9:36   ` Daniel J Blueman [this message]
2014-09-13  9:03   ` Daniel J Blueman
