public inbox for linux-kernel@vger.kernel.org
* Re: [2.6.1 MCE falseness?] Hardware reports non-fatal error
@ 2004-01-18 13:30 Pedro Larroy
  2004-01-18 14:23 ` glee
  0 siblings, 1 reply; 7+ messages in thread
From: Pedro Larroy @ 2004-01-18 13:30 UTC (permalink / raw)
  To: linux-kernel

I have also been getting apparently false MCEs since 2.5.xx.
I even had kernel panics in early 2.5 with MCE enabled. Now, in 2.6.0-xx
and 2.6.1, I just get them from time to time, but none are fatal.
Most of them are on CPU 0.

request_module: failed /sbin/modprobe -- char-major-6-0. error = 256
MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
Bank 0: e606200000000833
request_module: failed /sbin/modprobe --

The box is a dual Athlon MP with the AMD 760MP chipset.


nebula:/home/piotr# ./parsemce b 1 -a 0 -e e606200000000833
Status: (e606200000000833) Error IP valid
Restart IP valid.
nebula:/home/piotr#
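
For readers who want to check a "Bank N:" value by hand, here is a minimal C sketch that splits a 64-bit status word into its flag bits and code fields. It assumes the architectural MCi_STATUS layout (VAL in bit 63 down to PCC in bit 57, MCA error code in bits 15:0, model-specific code in bits 31:16); whether every bit has the same meaning on the Athlon MP is not confirmed in this thread. The macro and function names are illustrative, and this is not how parsemce is implemented.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative names. Bit positions follow the architectural MCi_STATUS
 * layout and are ASSUMED to apply to the values quoted in this thread. */
#define ST_VAL   (1ULL << 63) /* status register holds valid data */
#define ST_OVER  (1ULL << 62) /* an earlier error was overwritten */
#define ST_UC    (1ULL << 61) /* error was not corrected by hardware */
#define ST_EN    (1ULL << 60) /* error reporting was enabled */
#define ST_MISCV (1ULL << 59) /* MISC register is valid */
#define ST_ADDRV (1ULL << 58) /* ADDR register is valid */
#define ST_PCC   (1ULL << 57) /* processor context may be corrupt */

/* Bits 15:0, the MCA error code. */
static unsigned mca_error_code(uint64_t status)
{
    return (unsigned)(status & 0xffff);
}

/* Bits 31:16, the model-specific error code. */
static unsigned model_error_code(uint64_t status)
{
    return (unsigned)((status >> 16) & 0xffff);
}

/* Write a one-line summary of the set flags and both code fields into buf.
 * Returns the value of snprintf (characters needed, excluding the NUL). */
static int describe_status(uint64_t status, char *buf, size_t len)
{
    return snprintf(buf, len, "%s%s%s%s%s%s%smca=0x%04x model=0x%04x",
                    (status & ST_VAL)   ? "VAL "   : "",
                    (status & ST_OVER)  ? "OVER "  : "",
                    (status & ST_UC)    ? "UC "    : "",
                    (status & ST_EN)    ? "EN "    : "",
                    (status & ST_MISCV) ? "MISCV " : "",
                    (status & ST_ADDRV) ? "ADDRV " : "",
                    (status & ST_PCC)   ? "PCC "   : "",
                    mca_error_code(status), model_error_code(status));
}
```

Calling describe_status(0xe606200000000833ULL, buf, sizeof buf) on the Bank 0 value above gives a quick cross-check of the tool's output; the low 16 bits of that value are 0x0833.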




-- 
  Pedro Larroy Tovar  |  piotr%member.fsf.org 

Software patents are a threat to innovation in Europe please check: 
	http://www.eurolinux.org/     


* RE: [2.6.1 MCE falseness?] Hardware reports non-fatal error
@ 2004-01-20 19:43 Niel Lambrechts
  0 siblings, 0 replies; 7+ messages in thread
From: Niel Lambrechts @ 2004-01-20 19:43 UTC (permalink / raw)
  To: linux-kernel

I tried the mentioned patch, with a modification for my CPU type, but
still get the problem:

"Jan 20 21:30:23 ksyrium kernel: MCE: The hardware reports a non fatal,
correctable incident occurred on CPU 0.
Jan 20 21:30:23 ksyrium kernel: MCE: startbank = 1, vendor : 0, x86 = 6,
model = 9, mask = 5.
Jan 20 21:30:23 ksyrium kernel: Bank 1: f200000000000185"

As you can see, I added a little extra debugging info. Here is the
relevant portion of the code:

	if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
	     boot_cpu_data.x86 == 6) ||
	    (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
	     boot_cpu_data.x86 == 6 &&
	     boot_cpu_data.x86_model == 9 &&
	     boot_cpu_data.x86_mask == 5))
		startbank = 1;
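
One possible reading of the log above, sketched as a standalone C model rather than the actual kernel code: the reported values (vendor 0, x86 = 6, model = 9, mask = 5) satisfy the Intel branch of the condition, so startbank becomes 1. If the scan then runs from startbank upward, only bank 0 is skipped and a valid status in bank 1 is still reported. The struct, the function names, and the scan loop are hypothetical; the vendor constants are assumed to mirror the kernel's X86_VENDOR_* values.

```c
#include <stdint.h>

/* ASSUMPTION: these mirror the kernel's X86_VENDOR_INTEL / X86_VENDOR_AMD. */
enum { VENDOR_INTEL = 0, VENDOR_AMD = 2 };

/* Hypothetical stand-in for the boot_cpu_data fields used in the condition. */
struct cpu_id {
    int vendor;
    int family;   /* boot_cpu_data.x86 */
    int model;    /* boot_cpu_data.x86_model */
    int stepping; /* boot_cpu_data.x86_mask */
};

/* Same condition as the quoted snippet: skip bank 0 on matching CPUs. */
static int pick_startbank(const struct cpu_id *c)
{
    if ((c->vendor == VENDOR_AMD && c->family == 6) ||
        (c->vendor == VENDOR_INTEL && c->family == 6 &&
         c->model == 9 && c->stepping == 5))
        return 1;
    return 0;
}

/* Hypothetical scan: return a bitmask of banks at or above startbank whose
 * status word has bit 63 (valid) set. */
static unsigned scan_banks(const uint64_t *status, int nbanks, int startbank)
{
    unsigned reported = 0;
    for (int i = startbank; i < nbanks; i++)
        if (status[i] >> 63)
            reported |= 1u << i;
    return reported;
}
```

Under this model, suppressing the Bank 1 message would need a different start index or a per-bank check; whether that is appropriate for this CPU is a separate question that the thread does not settle.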

Comments would be appreciated.

-Niel




* [2.6.1 MCE falseness?] Hardware reports non-fatal error
@ 2004-01-18  1:44 Niel Lambrechts
  2004-01-18  2:03 ` Dave Jones
  0 siblings, 1 reply; 7+ messages in thread
From: Niel Lambrechts @ 2004-01-18  1:44 UTC (permalink / raw)
  To: linux-kernel


I consistently get the following problem with 2.6.1 after resuming from APM suspend:

"ksyrium kernel: MCE: The hardware reports a non fatal, correctable
incident occurred on CPU 0.

Message from syslogd@ksyrium at Wed Jan 14 13:33:06 2004 ...
ksyrium kernel: Bank 1: f2000000000001c5"

It does not happen on the other kernels I use (vanilla 2.4.24, SuSE 9
2.4.21-166), even though CONFIG_X86_MCE=y for both. The equipment is
brand-new (an IBM ThinkPad R50p), and it passes all of IBM's software
diagnostics.

I'd appreciate help with the parsemce parameters needed to interpret the
problem; I'm not sure my usage is correct. ;)

# ./parsemce -b 1 -a 0 -e f2000000000001c5
Status: (f2000000000001c5) Machine Check in progress.
Restart IP valid.

Is this really a hardware problem (or maybe a BIOS bug?), or are false
positives possible with the 2.6 MCE code?

-Niel






Thread overview: 7+ messages
2004-01-18 13:30 [2.6.1 MCE falseness?] Hardware reports non-fatal error Pedro Larroy
2004-01-18 14:23 ` glee
2004-01-18 20:04   ` Dave Jones
  -- strict thread matches above, loose matches on Subject: below --
2004-01-20 19:43 Niel Lambrechts
2004-01-18  1:44 Niel Lambrechts
2004-01-18  2:03 ` Dave Jones
2004-01-21 12:13   ` Stephen Rothwell
