From: Bert Karwatzki <spasswolf@web.de>
To: Yazen Ghannam <yazen.ghannam@amd.com>, Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>,
linux-kernel@vger.kernel.org, linux-next@vger.kernel.org,
linux-edac@vger.kernel.org, linux-acpi@vger.kernel.org,
x86@kernel.org, rafael@kernel.org, qiuxu.zhuo@intel.com,
nik.borisov@suse.com, Smita.KoralahalliChannabasappa@amd.com,
spasswolf@web.de
Subject: Re: spurious mce Hardware Error messages in next-20250912
Date: Tue, 16 Sep 2025 22:27:35 +0200
Message-ID: <9488e4bf935aa1e50179019419dfee93d306ded9.camel@web.de>
In-Reply-To: <20250916140744.GA1054485@yaz-khff2.amd.com>
On Tuesday, 16.09.2025 at 10:07 -0400, Yazen Ghannam wrote:
> On Tue, Sep 16, 2025 at 11:10:55AM +0200, Borislav Petkov wrote:
> > On Mon, Sep 15, 2025 at 11:43:26PM +0200, Bert Karwatzki wrote:
> > > After re-cloning linux-next I tested next-20250911 and I get no mce error messages
> > > even if I set the check_interval to 10.
> >
> > Yazen, I've zapped everything from the handler unification onwards:
> >
> > 28e82d6f03b0 x86/mce: Save and use APEI corrected threshold limit
> > c8f4cea38959 x86/mce: Handle AMD threshold interrupt storms
> > 5a92e88ffc49 x86/mce/amd: Define threshold restart function for banks
> > 922300abd79d x86/mce/amd: Remove redundant reset_block()
> > 9b92e18973ce x86/mce/amd: Support SMCA corrected error interrupt
> > fe02d3d00b06 x86/mce/amd: Enable interrupt vectors once per-CPU on SMCA systems
> > cf6f155e848b x86/mce: Unify AMD DFR handler with MCA Polling
> > 53b3be0e79ef x86/mce: Unify AMD THR handler with MCA Polling
> >
> > until this is properly sorted out, now this close to the merge window.
> >
> > Thanks, Bert, for reporting!
> >
>
> No problem, thanks Boris.
>
> Bert, can you please try the following patch on next-20250912?
>
> I expect that you will see the "debug" message, but the regular MCA
> logging should be gone.
>
I applied your patch on next-20250912; these are now the only messages
I get from mce:
[ 333.337544] [ C0] mce: DEBUG: CPU0 Bank:11 Status:0x8700aa0800000000
[ 333.337556] [ C0] mce: DEBUG: CPU0 Bank:14 Status:0x8724aa0800000000
[ 661.017608] [ C0] mce: DEBUG: CPU0 Bank:11 Status:0x8424aa4800a9413b
[ 661.017619] [ C0] mce: DEBUG: CPU0 Bank:14 Status:0x8700aa0800000000
[ 988.697243] [ C0] mce: DEBUG: CPU0 Bank:11 Status:0x8700aa0800000000
[ 988.697250] [ C0] mce: DEBUG: CPU0 Bank:14 Status:0x8724ab8800000000
[ 1316.377571] [ C0] mce: DEBUG: CPU0 Bank:11 Status:0x8700a28800000000
[ 1316.377582] [ C0] mce: DEBUG: CPU0 Bank:14 Status:0x8400aa4800a7413c
> Also, we haven't been able to reproduce this issue yet. So thank you for
> your help. It's much appreciated.
>
> Thanks,
> Yazen
>
It could still be a hardware error, so I'm also going to run memtest86+.
Bert Karwatzki
Thread overview: 19+ messages
2025-09-15 1:00 spurious mce Hardware Error messages in next-20250912 Bert Karwatzki
2025-09-15 17:55 ` Yazen Ghannam
2025-09-15 21:03 ` Bert Karwatzki
2025-09-15 21:43 ` Bert Karwatzki
2025-09-16 9:10 ` Borislav Petkov
2025-09-16 14:07 ` Yazen Ghannam
2025-09-16 20:27 ` Bert Karwatzki [this message]
2025-09-17 7:13 ` Bert Karwatzki
2025-09-17 14:41 ` Yazen Ghannam
2025-09-17 15:33 ` Bert Karwatzki
2025-09-17 19:26 ` Yazen Ghannam
2025-09-17 21:15 ` Yazen Ghannam
2025-09-17 22:01 ` Bert Karwatzki
2025-09-18 10:20 ` Nikolay Borisov
2025-09-18 21:00 ` Yazen Ghannam
2025-09-18 21:04 ` Luck, Tony
2025-09-18 21:14 ` Yazen Ghannam
2025-09-18 22:07 ` Bert Karwatzki
2025-10-09 13:20 ` Yazen Ghannam