From: AndrewL733 <AndrewL733@aol.com>
To: linux-kernel <linux-kernel@vger.kernel.org>
Subject: NMI error and Intel S5000PSL Motherboards
Date: Wed, 26 Sep 2007 02:12:34 -0800
Message-ID: <46FA3092.70108@aol.com>

We have about 100 servers based on Intel S5000PSL-SATA motherboards. 
They have been running for anywhere between 1 and 10 months. For the 
past few months, since updating them all to the 2.6.20.15 kernel 
(because of a bug in the 2.6.18 kernel), we have been seeing some 
strange NMI errors. For example:

Aug 29 09:02:10 master kernel: Uhhuh. NMI received for unknown reason 30.
Aug 29 09:02:10 master kernel: Do you have a strange power saving mode enabled?
Aug 29 09:02:10 master kernel: Dazed and confused, but trying to continue

Sometimes these errors cause a total system freeze. Most of the time the 
systems keep running.
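
For reference, the "reason" value in the first log line is printed in hexadecimal. The following is a minimal sketch for pulling that value out of log lines and listing which bits are set. It assumes the value is the status byte the kernel reads from I/O port 0x61 on conventional PC hardware; that is an assumption on our part, and the function names and bit labels are illustrative only.

```python
import re

# Matches the kernel message quoted above; the reason is two hex digits.
NMI_RE = re.compile(r"NMI received for unknown reason ([0-9a-fA-F]{2})")

# Conventional PC meaning of the upper bits of port 0x61.
# Assumption: the logged reason is this byte. Labels are illustrative.
PORT_61_BITS = {
    7: "memory parity / SERR#",
    6: "I/O channel check (IOCHK#)",
    5: "timer 2 output",
    4: "refresh toggle",
}

def nmi_reasons(lines):
    """Return the reason byte (as an int) from each matching log line."""
    return [int(m.group(1), 16) for m in map(NMI_RE.search, lines) if m]

def decode_reason(reason):
    """List labels for the bits set in a reason byte, highest bit first."""
    return [
        PORT_61_BITS.get(bit, f"bit {bit}")
        for bit in range(7, -1, -1)
        if reason & (1 << bit)
    ]
```

Calling `decode_reason(0x30)` on the value from our logs lists only bits 5 and 4. Under the assumption above, neither of the two error bits (0x80, 0x40) is set, which would be consistent with the kernel reporting the reason as "unknown".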

We have determined that these errors occur most frequently on machines 
that have an Intel PCI-e Quad Port Gigabit Adapter. On machines that HAVE 
these cards (it doesn't matter what slot they are in), the NMI errors 
can occur as frequently as every 3-5 minutes. On machines that do NOT 
have these Quad Port Adapters, the NMI errors occur about once per month 
on average. (We have tried the "in-kernel" e1000 drivers, as well as 
Intel's latest, 7.6.5.)

We have also determined (through a chance discovery) that running 
“scanpci” reproduces the NMI error 100 percent reliably on any 
machine that has the Quad Port NICs. Our various motherboards have 
different Intel BIOS versions – some have Rev 70, others 74, 79 or 81. 
They all exhibit the same behavior regardless of BIOS version.
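
The scanpci reproduction described above can be turned into a quick check. This is a minimal sketch, assuming `scanpci` and `dmesg` are on the PATH and the script runs as root. The helper names are hypothetical, and counting messages in the kernel ring buffer is only approximate if the buffer wraps between the two reads.

```python
import subprocess

# Text of the kernel message quoted earlier in this report.
NEEDLE = "NMI received for unknown reason"

def count_nmi(log_text):
    """Count unknown-NMI messages in a block of kernel log text."""
    return sum(1 for line in log_text.splitlines() if NEEDLE in line)

def read_kernel_log():
    """Return the current kernel ring buffer via dmesg (may need root)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout

def run_scanpci():
    """Run scanpci and discard its output (needs root)."""
    subprocess.run(["scanpci"], stdout=subprocess.DEVNULL, check=True)

def new_nmis_after(trigger=run_scanpci, read_log=read_kernel_log):
    """Return how many new unknown-NMI messages appear after the trigger."""
    before = count_nmi(read_log())
    trigger()
    after = count_nmi(read_log())
    return after - before
```

Running `new_nmis_after()` on an affected machine and getting a value greater than zero would match what we see by hand. The `trigger` and `read_log` parameters exist only so the logic can be exercised without touching real hardware.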

We have reproduced this problem with:

Mandriva 2008 RC2 (2.6.22 kernel)
Mandriva 2007 with custom 2.6.20.15 kernel
Mandriva 2007 with custom 2.6.19.8 kernel
Ubuntu “Feisty” with 2.6.20 kernel
Fedora Core 7 with 2.6.22 kernel

The problem does NOT occur with any distribution running a 2.6.18 kernel 
or lower, e.g. CentOS or SUSE 10, or Mandriva 2007 with its included 
2.6.17 kernel or a custom-compiled 2.6.18 kernel.

We have been in contact with Intel. Their high-level tech support people 
have basically said:

    “the errors we have logged so far are pointing to a kernel issue and
    not a hardware problem. If we [Intel] can confirm this, it will be
    up to the kernel developer or OS system manufacturer to debug those
    ones, as we do not perform Operating system support.”

In other words, Intel seems to be blaming the problem we are seeing on 
something introduced starting with the 2.6.19 kernel. We are not looking 
to blame anybody. We are only looking for a solution.

Does anybody have an idea what could be going on here, as well as what 
the solution may be? Going back to 2.6.18 or lower is not an option.


