From: Borislav Petkov <borislav.petkov@amd.com>
To: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Cc: Borislav Petkov <petkovbb@googlemail.com>,
linux-kernel@vger.kernel.org,
osrc-patches <osrc-patches@elbe.amd.com>
Subject: Re: K8 ECC error with linux-2.6.32
Date: Tue, 15 Dec 2009 16:30:26 +0100
Message-ID: <20091215153026.GD20880@aftab>
In-Reply-To: <200912150808.04814.johannes.hirte@fem.tu-ilmenau.de>
On Tue, Dec 15, 2009 at 08:08:04AM +0100, Johannes Hirte wrote:
> Northbridge Error, node 0, core: -1
> amd_decode_nb_mce: NBSL: 0x0005001b, NBSL: 0xa4000000
> K8 ECC error.
Yep, this is a benign GART TLB error. Reporting of it is disabled, but the
error is still being logged, so the amd64_edac module you're using sees it
and trips over it. There are two fixes:
1. If your BIOS has an option worded something like
"Gart Table Walk Error MC reporting: Disabled/Enabled",
setting it to Disabled should turn this off.
2. If there is no such BIOS option, the patch below should fix it. Can you
please test it (against v2.6.32)?
Thanks.
---
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 713ed7d..026f0cb 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -300,6 +300,12 @@ void amd_decode_nb_mce(int node_id, struct err_regs *regs, int handle_errors)
 	if (!handle_errors)
 		return;
 
+	/*
+	 * GART TLB error reporting is disabled by default. Bail out early.
+	 */
+	if (TLB_ERROR(ec) && !report_gart_errors)
+		return;
+
 	pr_emerg(" Northbridge Error, node %d", node_id);
 
 	/*
@@ -311,10 +317,9 @@ void amd_decode_nb_mce(int node_id, struct err_regs *regs, int handle_errors)
 		if (regs->nbsh & K8_NBSH_ERR_CPU_VAL)
 			pr_cont(", core: %u\n", (u8)(regs->nbsh & 0xf));
 	} else {
-		pr_cont(", core: %d\n", ilog2((regs->nbsh & 0xf)));
+		pr_cont(", core: %d\n", fls((regs->nbsh & 0xf) - 1));
 	}
 
-
 	pr_emerg("%s.\n", EXT_ERR_MSG(xec));
 
 	if (BUS_ERROR(ec) && nb_bus_decoder)
@@ -334,21 +339,6 @@ static void amd_decode_fr_mce(u64 mc5_status)
 static inline void amd_decode_err_code(unsigned int ec)
 {
 	if (TLB_ERROR(ec)) {
-		/*
-		 * GART errors are intended to help graphics driver developers
-		 * to detect bad GART PTEs. It is recommended by AMD to disable
-		 * GART table walk error reporting by default[1] (currently
-		 * being disabled in mce_cpu_quirks()) and according to the
-		 * comment in mce_cpu_quirks(), such GART errors can be
-		 * incorrectly triggered. We may see these errors anyway and
-		 * unless requested by the user, they won't be reported.
-		 *
-		 * [1] section 13.10.1 on BIOS and Kernel Developers Guide for
-		 * AMD NPT family 0Fh processors
-		 */
-		if (!report_gart_errors)
-			return;
-
 		pr_emerg(" Transaction: %s, Cache Level %s\n",
 			 TT_MSG(ec), LL_MSG(ec));
 	} else if (MEM_ERROR(ec)) {
--
Regards/Gruss,
Boris.
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632