From: Daniel J Blueman <daniel@numascale.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Hillf Danton <dhillf@gmail.com>, Borislav Petkov <bp@amd64.org>,
Ingo Molnar <mingo@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Steffen Persvold <sp@numascale.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [3.14] core onlining/hotplug regression
Date: Fri, 25 Jul 2014 17:36:51 +0800
Message-ID: <53D22533.9030401@numascale.com>
In-Reply-To: <alpine.DEB.2.10.1407251059130.23352@nanos>

On 07/25/2014 05:05 PM, Thomas Gleixner wrote:
> On Fri, 25 Jul 2014, Daniel J Blueman wrote:
>> On a larger x86 system with 1728 cores, 3.15(.6) asserts on
>> smpboot_thread_fn's td->cpu != smp_processor_id() consistently after ~1500
>> cores are online.
>>
>> Reverting the only directly related changes I could find [1,2] doesn't help.
>> Debugging indicates there is a race where the created thread is quickly
>> migrated to core 0 when this occurs, since smp_processor_id returns 0 in these
>> cases. Thomas introduced a thread parked state to fix related issues a year
>> back. Linux 3.14(.13) boots just nice.
>
> Weird. Commits [1,2] are definitely not the culprits.
>
>> Full boot output is at:
>> https://resources.numascale.com/linux-315-thread-mig.txt
>
> Not really helpful, as we don't see what causes it. We just see the
> wreckage.
>
>> Any theories so far? I'll start bisecting when I have full access to the
>> system again in a week and I'll do some more debugging with intermittent
>> access before then.
>
> One thing you could try is enabling tracing.
>
> "ftrace=function ftrace_dump_on_oops"
>
> It'll take a looooong time to spill out the traces, but that should
> give us the root cause precisely.

Good trick. I'll get this early next week and we'll see what's up.

Thanks,
Daniel
--
Daniel J Blueman
Principal Software Engineer, Numascale
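
The tracing suggestion quoted above amounts to adding two parameters to the kernel command line. As a minimal sketch, the helper below appends them to an existing command-line string without duplicating a parameter that is already set. The function name `add_boot_params`, the constant `TRACE_PARAMS`, and the sample command line are hypothetical and exist only for illustration; only the two parameters themselves come from the thread.

```python
from typing import Iterable

# The two boot parameters suggested in the thread.
TRACE_PARAMS = ["ftrace=function", "ftrace_dump_on_oops"]


def add_boot_params(cmdline: str, params: Iterable[str]) -> str:
    """Append each parameter to cmdline unless its key is already present.

    Hypothetical helper: a parameter's key is the text before the first '='.
    """
    tokens = cmdline.split()
    present = {tok.split("=", 1)[0] for tok in tokens}
    for param in params:
        key = param.split("=", 1)[0]
        if key not in present:
            tokens.append(param)
            present.add(key)
    return " ".join(tokens)


if __name__ == "__main__":
    # Assumed example command line, not taken from the reported system.
    current = "root=/dev/sda1 ro console=ttyS0,115200"
    print(add_boot_params(current, TRACE_PARAMS))
```

The resulting string would then go wherever the bootloader takes its kernel arguments. An existing `ftrace=` setting is deliberately left untouched rather than overridden.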
Thread overview: 4+ messages
2014-07-25 7:50 [3.14] core onlining/hotplug regression Daniel J Blueman
2014-07-25 9:05 ` Thomas Gleixner
2014-07-25 9:36 ` Daniel J Blueman [this message]
2014-09-13 9:03 ` Daniel J Blueman