From: Borislav Petkov <borislav.petkov@amd.com>
To: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Cc: Borislav Petkov <petkovbb@googlemail.com>,
linux-kernel@vger.kernel.org,
osrc-patches <osrc-patches@elbe.amd.com>
Subject: Re: K8 ECC error with linux-2.6.32
Date: Tue, 15 Dec 2009 16:30:26 +0100
Message-ID: <20091215153026.GD20880@aftab>
In-Reply-To: <200912150808.04814.johannes.hirte@fem.tu-ilmenau.de>
On Tue, Dec 15, 2009 at 08:08:04AM +0100, Johannes Hirte wrote:
> Northbridge Error, node 0, core: -1
> amd_decode_nb_mce: NBSL: 0x0005001b, NBSL: 0xa4000000
> K8 ECC error.
Yep, this is a benign GART TLB error. Its reporting is disabled, but the
error is still being logged, so the amd64_edac module you're using sees
it and trips over it. There are two fixes:
1. If your BIOS has an option with wording like
"Gart Table Walk Error MC reporting: Disabled/Enabled."
set it to Disabled, which should turn this off.
2. If there is no such BIOS option, the patch below should fix it. Can
you please test it (against v2.6.32)?
Thanks.
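For illustration only, and not part of the patch: a small userspace sketch of why the NBSL value from the report above counts as a TLB error and how the early bail-out added below treats it. The three macros are assumptions modelled on the edac_mce_amd definitions of that time (low 16 bits as error code, bits 20:16 as extended error code), `report_gart_errors` here is a plain variable standing in for the module's flag of the same name, and `should_report()` and `example()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of NBSL, modelled on the edac_mce_amd macros. */
#define ERROR_CODE(x)      ((x) & 0xffff)
#define EXT_ERROR_CODE(x)  (((x) >> 16) & 0x1f)
/* Assumed TLB error signature: error code of the form 0000 0000 0001 xxxx. */
#define TLB_ERROR(ec)      (((ec) & 0xfff0) == 0x0010)

/* Stand-in for the module flag; 0 means GART errors are not reported. */
int report_gart_errors;

/* Hypothetical helper: 1 if the error would be decoded, 0 if skipped. */
int should_report(uint32_t nbsl)
{
	uint32_t ec = ERROR_CODE(nbsl);

	/* Same condition as the early return added by the patch. */
	if (TLB_ERROR(ec) && !report_gart_errors)
		return 0;

	return 1;
}

/* Walk through the value from the quoted log line. */
void example(void)
{
	uint32_t nbsl = 0x0005001b;

	assert(ERROR_CODE(nbsl) == 0x001b);
	assert(EXT_ERROR_CODE(nbsl) == 0x05);
	assert(TLB_ERROR(ERROR_CODE(nbsl)));
	assert(should_report(nbsl) == 0);
}
```

With `report_gart_errors` left at 0, `should_report(0x0005001b)` returns 0, i.e. nothing would be printed for this error; setting the flag to 1 makes it return 1.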
---
diff --git a/drivers/edac/edac_mce_amd.c b/drivers/edac/edac_mce_amd.c
index 713ed7d..026f0cb 100644
--- a/drivers/edac/edac_mce_amd.c
+++ b/drivers/edac/edac_mce_amd.c
@@ -300,6 +300,12 @@ void amd_decode_nb_mce(int node_id, struct err_regs *regs, int handle_errors)
 	if (!handle_errors)
 		return;
 
+	/*
+	 * GART TLB error reporting is disabled by default. Bail out early.
+	 */
+	if (TLB_ERROR(ec) && !report_gart_errors)
+		return;
+
 	pr_emerg(" Northbridge Error, node %d", node_id);
 
 	/*
@@ -311,10 +317,9 @@ void amd_decode_nb_mce(int node_id, struct err_regs *regs, int handle_errors)
 		if (regs->nbsh & K8_NBSH_ERR_CPU_VAL)
 			pr_cont(", core: %u\n", (u8)(regs->nbsh & 0xf));
 	} else {
-		pr_cont(", core: %d\n", ilog2((regs->nbsh & 0xf)));
+		pr_cont(", core: %d\n", fls((regs->nbsh & 0xf) - 1));
 	}
 
-
 	pr_emerg("%s.\n", EXT_ERR_MSG(xec));
 
 	if (BUS_ERROR(ec) && nb_bus_decoder)
@@ -334,21 +339,6 @@ static void amd_decode_fr_mce(u64 mc5_status)
 static inline void amd_decode_err_code(unsigned int ec)
 {
 	if (TLB_ERROR(ec)) {
-		/*
-		 * GART errors are intended to help graphics driver developers
-		 * to detect bad GART PTEs. It is recommended by AMD to disable
-		 * GART table walk error reporting by default[1] (currently
-		 * being disabled in mce_cpu_quirks()) and according to the
-		 * comment in mce_cpu_quirks(), such GART errors can be
-		 * incorrectly triggered. We may see these errors anyway and
-		 * unless requested by the user, they won't be reported.
-		 *
-		 * [1] section 13.10.1 on BIOS and Kernel Developers Guide for
-		 *     AMD NPT family 0Fh processors
-		 */
-		if (!report_gart_errors)
-			return;
-
 		pr_emerg(" Transaction: %s, Cache Level %s\n",
 			 TT_MSG(ec), LL_MSG(ec));
 	} else if (MEM_ERROR(ec)) {
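As a side illustration of the core-number change in the second hunk (again not part of the patch), the sketch below evaluates the old and the new expression in userspace. `my_fls()` and `my_ilog2()` are hypothetical stand-ins that mimic the kernel's `fls()` and the runtime 32-bit `ilog2()` (which is `fls(n) - 1`); `core_old()` and `core_new()` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for fls(): 1-based position of the highest set bit, 0 if none. */
int my_fls(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Stand-in for ilog2() on a runtime 32-bit value: -1 when x is 0. */
int my_ilog2(uint32_t x)
{
	return my_fls(x) - 1;
}

/* Expression from the removed line. */
int core_old(uint32_t nbsh)
{
	return my_ilog2(nbsh & 0xf);
}

/*
 * Expression from the added line. The subtraction is done on an
 * unsigned value, so a zero mask wraps around to 0xffffffff.
 */
int core_new(uint32_t nbsh)
{
	return my_fls((nbsh & 0xf) - 1);
}

/* Values for one-hot masks and for an empty mask. */
void example(void)
{
	assert(core_old(0x1) == 0 && core_new(0x1) == 0);
	assert(core_old(0x8) == 3 && core_new(0x8) == 3);
	assert(core_old(0xa4000000) == -1);	/* low four bits are 0 */
	assert(core_new(0xa4000000) == 32);
}
```

For one-hot masks both expressions give the same core number. Assuming the second value in the quoted log line is NBSH (0xa4000000), its low four bits are zero, which is where the old expression produces the "core: -1" seen in the report.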
--
Regards/Gruss,
Boris.
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632