From: Ingo Molnar <mingo@elte.hu>
To: Jes Sorensen <jes@sgi.com>, Jens Axboe <jens.axboe@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>, Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Latest Linus tree oopses on Nehalem box
Date: Fri, 21 Aug 2009 13:46:45 +0200
Message-ID: <20090821114645.GD24647@elte.hu>
In-Reply-To: <4A8E7CBE.3020209@sgi.com>
* Jes Sorensen <jes@sgi.com> wrote:
> Hi,
>
> I am seeing this one with the latest Linus' git tree as of this
> morning on a Nehalem box. Using the defconfig + megaraid driver.
>
> Not sure if this is already fixed, or if someone already knows
> what's wrong? Smells like yet another BIOS bug - yes, the BIOS on
> this thing is rubbish.
My Nehalem (16 logical CPUs) boots fine:
aldebaran:~> uname -a
Linux aldebaran 2.6.31-rc6-tip-01272-g9919e28-dirty #1518 SMP Fri
Aug 21 11:13:12 CEST 2009 x86_64 x86_64 x86_64 GNU/Linux
> [ 6.664800] RIP: 0010:[<ffffffff810391e7>] [<ffffffff810391e7>]
> find_busiest_group+0x620/0x6fd
Nothing similar is open at the moment.
There's only one open .31 scheduler regression bug at the moment: a
rare division by zero bug that sometimes crashes boxes - the bigger
the box the likelier the crash.
Your crash looks to be one of:
1) a genuine scheduler bug tickled on your new hardware. Needs to
be bisected/debugged/fixed.
2) a BIOS bug passing crappy ACPI tables which cause us to create a
buggy sched-domains tree or so. We do treat ACPI data as
external untrusted data and try to use it in sane ways only, but
such bugs have happened in the past and could happen again.
The scheduler has a sanity check for the sched-domains arch setup: if
you enable CONFIG_SCHED_DEBUG=y then sched_domain_debug() will
become noisy in your syslog if there's something wrong (but it won't
stop the bootup, so you have to actively check your syslog).
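A rough sketch of how one might check for that output, since the boot itself will not stop on it. The helper name sched_domain_report is hypothetical, and the two grep patterns are only an assumption about what the debug messages look like on a given kernel; adjust them to whatever your log actually shows.

```shell
#!/bin/sh
# Sketch only: sched_domain_report is a hypothetical helper name, and the
# two grep patterns are assumptions about the debug message format.

# Read a kernel log on stdin and keep only the lines that look like
# sched-domain debug output or error reports.
sched_domain_report() {
    grep -E 'sched-domain|ERROR:' ||
        echo "no sched-domain debug output found"
}

# Example usage (commented out so the sketch has no side effects):
#   grep CONFIG_SCHED_DEBUG .config     # confirm the option is set
#   dmesg | sched_domain_report         # inspect the current boot log
```

If the filter prints nothing but the fallback line, either the option is not enabled or the debug code found nothing to complain about; the .config check distinguishes the two.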
It might be useful to see your full crash log, if you are allowed to
post it, along with your kernel .config. It would also be useful to
know whether this is a regression relative to .30 or a not-yet-fixed
bug triggering on your class of hardware.
Thanks,
Ingo