public inbox for linux-kernel@vger.kernel.org
From: Tony Vroon <tony@vroon.org>
To: Felix von Leitner <felix-linuxkernel@fefe.de>
Cc: Linux Kernel Mailing list <linux-kernel@vger.kernel.org>
Subject: Re: MCEs
Date: Fri, 24 Oct 2008 17:23:18 +0100
Message-ID: <1224865398.9632.22.camel@localhost>
In-Reply-To: <20081024124502.GA9425@codeblau.de>

On Fri, 2008-10-24 at 14:45 +0200, Felix von Leitner wrote:
> Now the most common causes for MCEs are apparently heat issues and bad
> memory.  I can rule out both.

Are you sure? I have had MCEs and instability for a while now. Using

  mcelog --k8 --dmi /dev/mcelog

I finally got a clear "this component is at fault" message, pinpointing
DIMM 4 on CPU 2. I shuffled the DIMMs around and then used the machine
again. The message shifted with the DIMM, to DIMM 1 on CPU 1.
Memtest86+ doesn't appear to stress the hardware enough to provoke
single or multi-bit errors, though, so a few successful passes in
Memtest86+ do not rule out a RAM problem.
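
The swap test described above can be written down as a small check. This
is only a sketch in Python, assuming you note by hand which slot each
error report named and which physical module sat in that slot at the
time. The module label "A", the slot strings and the function name
fault_follows are made up for illustration; none of this is mcelog
output or part of mcelog.

```python
def fault_follows(observations):
    """Say whether hand-recorded error reports track a module or a slot.

    observations: list of (slot, module) tuples, one per error report.
    `module` is whatever label you put on the physical DIMM yourself.
    """
    slots = {slot for slot, _ in observations}
    modules = {module for _, module in observations}
    if len(modules) == 1 and len(slots) > 1:
        return "module"   # the same DIMM was blamed in different slots
    if len(slots) == 1 and len(modules) > 1:
        return "slot"     # the same slot was blamed with different DIMMs
    return "inconclusive"


# Hypothetical notes: module "A" was blamed before and after the shuffle.
reports = [("CPU 2 DIMM 4", "A"), ("CPU 1 DIMM 1", "A")]
print(fault_follows(reports))
```

If the same module is blamed in two different slots, the module itself is
the likely culprit. If the same slot is blamed regardless of which module
sits in it, the slot, the board or the memory controller is more suspect.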

Temperatures can also get high at locations in the machine that have no
sensors (specifically voltage regulators). To check for heat problems
you could run the machine with the tower case lying on its side on the
floor, so hot air rises up past the PCI/PCIe cards instead of getting
trapped underneath them.

Note that LKML isn't the friendliest place to get help debugging MCEs,
as an MCE will be considered a hardware fault and thus off-topic.
Think of an MCE as a 'check engine' light in your car. It doesn't tell
you what's wrong, just that something is bad and should be investigated.

Regards,
Tony V.

Thread overview: 7+ messages
2008-10-24 12:45 MCEs Felix von Leitner
2008-10-24 16:23 ` Tony Vroon [this message]
2008-10-24 18:04 ` MCEs Andi Kleen
2008-10-24 21:44   ` MCEs Felix von Leitner
2008-10-24 23:07   ` MCEs Felix von Leitner
2008-10-25  7:00     ` MCEs Andi Kleen
2008-10-25 10:05       ` MCEs Felix von Leitner
