From: Robert Hancock <hancockr@shaw.ca>
To: Jack Malmostoso <malmostoso@email.it>
Cc: linux-kernel@vger.kernel.org
Subject: Re: MCE on an NForce4 board (again)
Date: Fri, 23 Mar 2007 08:24:14 -0600 [thread overview]
Message-ID: <4603E30E.3050203@shaw.ca> (raw)
In-Reply-To: <fa.7WMpSSoKT5HrVvZw161tkP591C8@ifi.uio.no>
Jack Malmostoso wrote:
> Hi there list,
>
> this is a repost from the x86-64.org discuss list. I think it could be
> relevant here too, if not please excuse me and forget this message. I am not
> subscribed to the LKML, but I can follow the thread on usenet, so no need to
> CC. Feel free to do it if you think it's relevant!
>
> Since a couple of days I have been experiencing weird lockups on my AMD64
> machine running Debian Sid.
> The computer is composed like this:
>
> Asus A8N-E (NForce4 socket 939)
> Athlon64 X2 3800+
> 2x512MB Corsair TwinX
> 2x250GB Sata Western Digital
> 2xDVD/DVDRW on PATA channels
>
> None of the components has *ever* been overclocked and the whole is one year
> old.
>
> I have logged on in single mode and tried to find a way to reproduce the
> crash. It has been enough to do a stupid script like
>
> while true; do
> tar -xjvf linux-2.6.20.3.tar.bz2 && rm -fr linux-2.6.20.3
> done
>
> and regularly, after one or two cycles, the system would lockup with this
> error:
>
> CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f
> TSC 185ef6d81ca
>
> It also gave me the hint to use mcelog, so I rebooted and installed it. The
> decoded error read:
>
> CPU 0 4 northbridge TSC 185ef6d81ca
> Northbridge Watchdog error
> bit57 = processor context corrupt
> bit61 = error uncorrected
> bus error 'generic participation, request timed out
> generic error mem transaction
> generic access, level generic'
> STATUS b200000000070f0f MCGSTATUS 4
>
> I googled and found a thread on the x86-64.org discuss list [1] that blamed
> the RAM for the
> problem. So I have started doing some tests:
>
> 1) If I use only 512MB of my RAM, alternatively, I don't get the error.
> 2) Memtest+ has been running for 10hrs and no errors have been detected.
> I'll have it running for the day just to be sure.
>
> Additionally, right before the MCE, I could read another error:
>
> ata3: CPB flags CMD err, flags=0x11
>
> Googling this brought up some threads in the LKML about the sata_nv driver
> (the ADMA bit, I think).
It means the controller had an error sending commands to the drive.
There's a possibility that if the controller is taking a long time
retrying commands, etc. the CPU might complain about a bus transaction
timing out. I'd check the SATA cable for that drive, maybe try a
different one.
>
> Since I am running a 2.6.20 kernel (a testing version from the Debian kernel
> team) I have tried booting older kernels but looks like anything I have
> tried does not work (but this is not your problem).
>
> So I have booted a livecd (I had around an Ubuntu with a 2.6.12 kernel) and
> with my great surprise the machine worked without problems.
>
> Now, do you think it would be even slightly possible that some regression in
> the sata_nv module could trigger an MCE? If not, would you put your money on
> the RAM, or could the motherboard be blamed? I hope it's not the CPU ;)
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/
next parent reply other threads:[~2007-03-23 14:25 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <fa.7WMpSSoKT5HrVvZw161tkP591C8@ifi.uio.no>
2007-03-23 14:24 ` Robert Hancock [this message]
2007-03-23 10:16 MCE on an NForce4 board (again) Jack Malmostoso
2007-03-23 11:20 ` Avuton Olrich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4603E30E.3050203@shaw.ca \
--to=hancockr@shaw.ca \
--cc=linux-kernel@vger.kernel.org \
--cc=malmostoso@email.it \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.