public inbox for linux-edac@vger.kernel.org
* [PATCH] x86/mce/amd: Filter bogus L3 deferred errors on CZN A0
@ 2026-02-28 14:08 Yazen Ghannam
  2026-03-02 16:16 ` Mario Limonciello
  0 siblings, 1 reply; 3+ messages in thread
From: Yazen Ghannam @ 2026-02-28 14:08 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, tony.luck, x86, Yazen Ghannam, Bert Karwatzki,
	Mario Limonciello

A user has observed multiple L3 cache deferred error logs after the
recent kernel rework of deferred error handling. [1]

Upon inspection, the errors were determined to be bogus due to
inconsistent status values. The user also verified that the bogus
MCA_DESTAT values are present on the system even with an older
kernel. [2] The errors appear to be garbage values present in the
MCA_DESTAT registers of some L3 cache banks. These were implicitly
ignored before the recent kernel rework because they do not generate a
deferred error interrupt.

A later revision of the rework patch was merged for v6.19. This
naturally filtered out most of the bogus error logs. However, a few
signatures still remain. [3]

Add the remaining bogus signatures to the MCE filter function. Minimize
the scope of the filter to the reported CPU family/model/stepping so
that similar issues are not implicitly masked on other systems.

Fixes: 7cb735d7c0cb ("x86/mce: Unify AMD DFR handler with MCA Polling")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de                # [1]
Link: https://lore.kernel.org/6e1eda7dd55f6fa30405edf7b0f75695cf55b237.camel@web.de # [2]
Link: https://lore.kernel.org/21ba47fa8893b33b94370c2a42e5084cf0d2e975.camel@web.de # [3]
---
 arch/x86/kernel/cpu/mce/amd.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
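
For readers who want to check a logged status value by hand, the filter
condition described above can be sketched as a standalone userspace
predicate. This is only an illustration, not part of the patch: the
struct cpu_id type and the helper names are hypothetical stand-ins for
struct cpuinfo_x86 and the bank type lookup, while the bit positions and
the XEC() macro are assumed to match arch/x86/include/asm/mce.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed to match arch/x86/include/asm/mce.h. */
#define MCI_STATUS_PCC      (1ULL << 57)  /* processor context corrupt */
#define MCI_STATUS_DEFERRED (1ULL << 44)  /* uncorrected error, deferred exception */

/* Extended error code: status bits [21:16] when used with a 0x3f mask. */
#define XEC(x, mask)        (((x) >> 16) & (mask))

/* Hypothetical stand-in for the cpuinfo_x86 fields the filter checks. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Cezanne A0: Family 19h, Model 50h, Stepping 0. */
static bool is_czn_a0(const struct cpu_id *c)
{
	return c->family == 0x19 && c->model == 0x50 && c->stepping == 0x0;
}

/*
 * Returns true when a deferred error from an L3 cache bank matches one of
 * the two bogus signatures: PCC set, or extended error code 29.
 * bank_is_l3 replaces the SMCA_L3_CACHE bank type comparison.
 */
static bool is_bogus_l3_deferred(const struct cpu_id *c, bool bank_is_l3,
				 uint64_t status)
{
	if (!is_czn_a0(c) || !bank_is_l3 || !(status & MCI_STATUS_DEFERRED))
		return false;

	return (status & MCI_STATUS_PCC) || XEC(status, 0x3f) == 29;
}
```

Because the CPU check comes first, the same status values on any other
family, model, or stepping are not filtered, which mirrors the intent
of keeping the scope of the filter minimal.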

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index da13c1e37f87..7a94492aa50f 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -604,6 +604,18 @@ bool amd_filter_mce(struct mce *m)
 	enum smca_bank_types bank_type = smca_get_bank_type(m->extcpu, m->bank);
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
+	/*
+	 * Bogus L3 cache deferred errors on Cezanne A0.
+	 *
+	 * Case #1: PCC bit set. This is not valid for deferred errors.
+	 * Case #2: XEC 29. This is not a valid error code.
+	 */
+	if (c->x86 == 0x19 && c->x86_model == 0x50 && c->x86_stepping == 0x0 &&
+	    bank_type == SMCA_L3_CACHE && (m->status & MCI_STATUS_DEFERRED)) {
+		if ((m->status & MCI_STATUS_PCC) || XEC(m->status, 0x3f) == 29)
+			return true;
+	}
+
 	/* See Family 17h Models 10h-2Fh Erratum #1114. */
 	if (c->x86 == 0x17 &&
 	    c->x86_model >= 0x10 && c->x86_model <= 0x2F &&
-- 
2.53.0


Thread overview: 3+ messages
2026-02-28 14:08 [PATCH] x86/mce/amd: Filter bogus L3 deferred errors on CZN A0 Yazen Ghannam
2026-03-02 16:16 ` Mario Limonciello
2026-03-03 14:00   ` Yazen Ghannam
