public inbox for linux-ia64@vger.kernel.org
From: "Luck, Tony" <tony.luck@intel.com>
To: linux-ia64@vger.kernel.org
Subject: RE: hardware error state at cmc
Date: Mon, 08 Dec 2003 18:23:23 +0000
Message-ID: <marc-linux-ia64-107090799908270@msgid-missing>
In-Reply-To: <marc-linux-ia64-107090003329219@msgid-missing>

> Hello,
> could anyone give me a hint about the meaning of the following
> appearing in system.log (i2000, 2.6.0-test4, uptime ~40 days):
> 
> kernel: +BEGIN HARDWARE ERROR STATE AT CMC
> kernel: +Err Record ID: 37    SAL Rev:  0.02
> kernel: +Time: 12/03/2003 18:56:34    Severity 258
> kernel: +Processor Device Error Info Section
> kernel:  Processor Error Map: 0x4000
> kernel:  Processor State Param: 0x8000000fff611b0
> kernel:  Processor LID: 0x3000000
> kernel: + Cache check info[0]
> kernel: +  Level: L0, Index: 0, Operation: Unknown,
> kernel:  CPUID Regs: 0x49656e69756e6547 0x6c65746e 0x0 0x7000804
> kernel: +END HARDWARE ERROR STATE AT CMC

One of your processors had a correctable error in its cache. The
cpu fixed it, but interrupted the OS to tell you that it happened.

The "Processor LID" field should tell you which cpu had the error
(should match the "cr.lid" value of one of your cpus).  This is
probably the 37th error since system reset (Error Record ID is
37).  You might want to check your logs to see what kinds of errors
were reported for the previous 36 errors to see if there is any sort
of pattern (which may indicate real hardware problems).  If the
errors are of different types, and reported by different processors,
then you may just be seeing stray neutrons flipping bits as they
pass through.
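
To look for such a pattern, something along these lines could tally
the records by reporting processor and check type. This is only a
rough sketch in Python, not part of any kernel or salinfo tooling: the
function names (parse_cmc_records, tally, lid_fields) are made up for
illustration, the regular expressions assume the log layout quoted
above, and the cr.lid split (id in bits 31:24, eid in bits 23:16) is
my reading of the architecture manual and should be checked against it.

```python
import re
from collections import Counter

BEGIN = "+BEGIN HARDWARE ERROR STATE AT CMC"
END = "+END HARDWARE ERROR STATE AT CMC"


def parse_cmc_records(lines):
    """Yield one dict per BEGIN/END block found in the given log lines."""
    record = None
    for line in lines:
        # Drop everything up to and including the syslog "kernel:" tag.
        text = line.split("kernel:", 1)[-1].strip()
        if text.startswith(BEGIN):
            record = {"checks": []}
        elif record is None:
            continue  # not inside a CMC record
        elif text.startswith(END):
            yield record
            record = None
        else:
            m = re.search(r"Err Record ID:\s*(\d+)", text)
            if m:
                record["record_id"] = int(m.group(1))
            m = re.search(r"Processor LID:\s*(0x[0-9a-fA-F]+)", text)
            if m:
                record["lid"] = int(m.group(1), 16)
            m = re.match(r"\+\s*(\w+) check info", text)
            if m:
                record["checks"].append(m.group(1))


def tally(records):
    """Count records per (Processor LID, check type) pair."""
    counts = Counter()
    for rec in records:
        for check in rec["checks"] or ["Unknown"]:
            counts[(rec.get("lid"), check)] += 1
    return counts


def lid_fields(lid):
    """Split a LID value into (id, eid).

    Assumption: id is bits 31:24 and eid is bits 23:16 of cr.lid.
    """
    return (lid >> 24) & 0xFF, (lid >> 16) & 0xFF


# The record quoted earlier in this message, used as sample input.
SAMPLE = """\
kernel: +BEGIN HARDWARE ERROR STATE AT CMC
kernel: +Err Record ID: 37    SAL Rev:  0.02
kernel: +Time: 12/03/2003 18:56:34    Severity 258
kernel: +Processor Device Error Info Section
kernel:  Processor Error Map: 0x4000
kernel:  Processor LID: 0x3000000
kernel: + Cache check info[0]
kernel: +  Level: L0, Index: 0, Operation: Unknown,
kernel: +END HARDWARE ERROR STATE AT CMC
""".splitlines()

records = list(parse_cmc_records(SAMPLE))
for (lid, check), n in sorted(tally(records).items()):
    print(f"LID {lid:#x} (id, eid = {lid_fields(lid)}): {check} x{n}")
```

Fed the whole system log instead of SAMPLE, many entries for the same
LID and check type would point at one processor; a scattering across
LIDs and types looks more like random bit flips.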

You might also want to get 2.6.0-test11 and apply Keith Owens' patch
http://marc.theaimsgroup.com/?l=linux-ia64&m=106974968032730&w=2 to
get easier-to-read logs, together with Keith's "salinfo" package,
which Bjorn hosted at:
ftp://ftp.kernel.org/pub/linux/kernel/people/helgaas/salinfo-0.4.tar.gz

-Tony Luck

Thread overview: 2+ messages
2003-12-08 16:05 hardware error state at cmc Christian Hinkelbein
2003-12-08 18:23 ` Luck, Tony [this message]
