From: Robin Holt <holt@sgi.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Robin Holt <holt@sgi.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
linux-kernel@vger.kernel.org
Subject: Re: Commit cb83b62 fails to boot with a divide by zero error.
Date: Mon, 14 May 2012 07:40:05 -0500
Message-ID: <20120514124005.GL3751@sgi.com>
In-Reply-To: <20120514104829.GA25923@gmail.com>

On Mon, May 14, 2012 at 12:48:29PM +0200, Ingo Molnar wrote:
>
> * Robin Holt <holt@sgi.com> wrote:
>
> > On Fri, May 11, 2012 at 05:36:13PM +0200, Peter Zijlstra wrote:
> > > On Fri, 2012-05-11 at 10:05 -0500, Robin Holt wrote:
> > > > On Fri, May 11, 2012 at 04:33:10PM +0200, Peter Zijlstra wrote:
> > > > > On Fri, 2012-05-11 at 08:39 -0500, Robin Holt wrote:
> > > > >
> > > > > > We found that reverting the commit:
> > > > > > cb83b62 (x86/sched/core) sched/numa: Rewrite the CONFIG_NUMA sched domain support
> > > > > >
> > > > > > also got things working.
> > > > >
> > > > > there's a particularly stupid bug in that code
> > > >
> > > > Even with that applied, I still get the divide by zero.
> > >
> > > Humm.. what kind of machine is this? And how far along does it get in
> > > booting? ->power isn't supposed to get to 0.
> >
> > It is a four blade (8 socket 80 core 160 hyper-thread machine)
> > with 40 GB of RAM.
> >
> > Looking at the earlier kernel messages, I am wondering if I
> > don't have a BIOS that is giving me crud. I have messages
> > about hyperthreads being on different nodes. That had not
> > been happening in the past. I don't have access to the
> > machine now, but the BIOS string that had printed out is from
> > a developer's debug version.
> >
> > When I get access to the machine again (likely not until
> > Monday), I will flash a release BIOS and retest. Until then,
> > please feel free to ignore me.
>
> Please don't re-flash the BIOS! We want to fix this bug - the
> kernel should never crash on whatever topology data the BIOS
> passes.
>
> We can sanitize it or ignore it, but crashing is not an option.
> So lets figure this out, ok?

I have the old BIOS as well, so I can flash back. Plus, I have the
BIOS developer's description of his changes, and he has saved his
work area. Toggling back and forth should not be a problem, and it
should help us determine the source and the "correct" fix.

Robin