From: Yazen Ghannam <yazen.ghannam@amd.com>
To: <x86@kernel.org>, Tony Luck <tony.luck@intel.com>,
"Rafael J. Wysocki" <rafael@kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <linux-edac@vger.kernel.org>,
<Smita.KoralahalliChannabasappa@amd.com>,
Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
Nikolay Borisov <nik.borisov@suse.com>,
<linux-acpi@vger.kernel.org>,
"Yazen Ghannam" <yazen.ghannam@amd.com>
Subject: [PATCH v5 12/20] x86/mce: Move machine_check_poll() status checks to helper functions
Date: Mon, 25 Aug 2025 17:33:09 +0000
Message-ID: <20250825-wip-mca-updates-v5-12-865768a2eef8@amd.com>
In-Reply-To: <20250825-wip-mca-updates-v5-0-865768a2eef8@amd.com>

There are a number of generic and vendor-specific status checks in
machine_check_poll(). These are used to determine if an error should be
skipped.

Move these into helper functions. Future vendor-specific checks will be
added to the helpers.

Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Notes:
    Link: https://lore.kernel.org/r/20250624-wip-mca-updates-v4-15-236dd74f645f@amd.com

    v4->v5:
    * No change.

    v3->v4:
    * No change.

    v2->v3:
    * Add tags from Qiuxu and Tony.

    v1->v2:
    * Change log_poll_error() to should_log_poll_error().
    * Keep code comment.

arch/x86/kernel/cpu/mce/core.c | 88 +++++++++++++++++++++++-------------------
1 file changed, 48 insertions(+), 40 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 21a5ea239e93..b3593a370bc9 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -714,6 +714,52 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)

 DEFINE_PER_CPU(unsigned, mce_poll_count);

+/*
+ * Newer Intel systems that support software error
+ * recovery need to make additional checks. Other
+ * CPUs should skip over uncorrected errors, but log
+ * everything else.
+ */
+static bool ser_should_log_poll_error(struct mce *m)
+{
+	/* Log "not enabled" (speculative) errors */
+	if (!(m->status & MCI_STATUS_EN))
+		return true;
+
+	/*
+	 * Log UCNA (SDM: 15.6.3 "UCR Error Classification")
+	 * UC == 1 && PCC == 0 && S == 0
+	 */
+	if (!(m->status & MCI_STATUS_PCC) && !(m->status & MCI_STATUS_S))
+		return true;
+
+	return false;
+}
+
+static bool should_log_poll_error(enum mcp_flags flags, struct mce_hw_err *err)
+{
+	struct mce *m = &err->m;
+
+	/* If this entry is not valid, ignore it. */
+	if (!(m->status & MCI_STATUS_VAL))
+		return false;
+
+	/*
+	 * If we are logging everything (at CPU online) or this
+	 * is a corrected error, then we must log it.
+	 */
+	if ((flags & MCP_UC) || !(m->status & MCI_STATUS_UC))
+		return true;
+
+	if (mca_cfg.ser)
+		return ser_should_log_poll_error(m);
+
+	if (m->status & MCI_STATUS_UC)
+		return false;
+
+	return true;
+}
+
 /*
  * Poll for corrected events or events that happened before reset.
  * Those are just logged through /dev/mcelog.
@@ -765,48 +811,10 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		if (!mca_cfg.cmci_disabled)
 			mce_track_storm(m);

-		/* If this entry is not valid, ignore it */
-		if (!(m->status & MCI_STATUS_VAL))
+		/* Verify that the error should be logged based on hardware conditions. */
+		if (!should_log_poll_error(flags, &err))
 			continue;

-		/*
-		 * If we are logging everything (at CPU online) or this
-		 * is a corrected error, then we must log it.
-		 */
-		if ((flags & MCP_UC) || !(m->status & MCI_STATUS_UC))
-			goto log_it;
-
-		/*
-		 * Newer Intel systems that support software error
-		 * recovery need to make additional checks. Other
-		 * CPUs should skip over uncorrected errors, but log
-		 * everything else.
-		 */
-		if (!mca_cfg.ser) {
-			if (m->status & MCI_STATUS_UC)
-				continue;
-			goto log_it;
-		}
-
-		/* Log "not enabled" (speculative) errors */
-		if (!(m->status & MCI_STATUS_EN))
-			goto log_it;
-
-		/*
-		 * Log UCNA (SDM: 15.6.3 "UCR Error Classification")
-		 * UC == 1 && PCC == 0 && S == 0
-		 */
-		if (!(m->status & MCI_STATUS_PCC) && !(m->status & MCI_STATUS_S))
-			goto log_it;
-
-		/*
-		 * Skip anything else. Presumption is that our read of this
-		 * bank is racing with a machine check. Leave the log alone
-		 * for do_machine_check() to deal with it.
-		 */
-		continue;
-
-log_it:
 		mce_read_aux(&err, i);
 		m->severity = mce_severity(m, NULL, NULL, false);
 		/*
--
2.51.0
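
The decision logic that this patch moves into helpers can be tried out in
userspace. What follows is only a minimal sketch, not the kernel code: the
sketch_* function names and the "ser" parameter (standing in for mca_cfg.ser)
are hypothetical, the helpers take a raw status value instead of struct mce,
and the bit positions are assumed to match the kernel's MCI_STATUS_* and
MCP_UC definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's MCI_STATUS_* bit positions. */
#define MCI_STATUS_VAL	(1ULL << 63)	/* valid error */
#define MCI_STATUS_UC	(1ULL << 61)	/* uncorrected error */
#define MCI_STATUS_EN	(1ULL << 60)	/* error enabled */
#define MCI_STATUS_PCC	(1ULL << 57)	/* processor context corrupt */
#define MCI_STATUS_S	(1ULL << 56)	/* signaled machine check */

/* Assumed to mirror MCP_UC from enum mcp_flags. */
#define MCP_UC		(1U << 1)

/* Hypothetical stand-in for ser_should_log_poll_error(). */
static bool sketch_ser_should_log(uint64_t status)
{
	/* Log "not enabled" (speculative) errors. */
	if (!(status & MCI_STATUS_EN))
		return true;

	/* Log UCNA: UC == 1 && PCC == 0 && S == 0. */
	if (!(status & MCI_STATUS_PCC) && !(status & MCI_STATUS_S))
		return true;

	return false;
}

/*
 * Hypothetical stand-in for should_log_poll_error(); "ser" plays
 * the role of mca_cfg.ser.
 */
static bool sketch_should_log(unsigned int flags, bool ser, uint64_t status)
{
	/* Invalid entries are never logged. */
	if (!(status & MCI_STATUS_VAL))
		return false;

	/* Log everything when asked to, and always log corrected errors. */
	if ((flags & MCP_UC) || !(status & MCI_STATUS_UC))
		return true;

	if (ser)
		return sketch_ser_should_log(status);

	/* Same order as the patch: skip uncorrected, log the rest. */
	if (status & MCI_STATUS_UC)
		return false;

	return true;
}
```

In this sketch, as in the patch, the UC bit is always set by the time the
non-SER branch is reached, because corrected errors have already returned
true. A valid uncorrected error polled without MCP_UC on a non-SER system is
therefore always skipped.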
Thread overview:
2025-08-25 17:32 [PATCH v5 00/20] AMD MCA interrupts rework Yazen Ghannam
2025-08-25 17:32 ` [PATCH v5 01/20] x86/mce/amd: Rename threshold restart function Yazen Ghannam
2025-08-25 17:32 ` [PATCH v5 02/20] x86/mce/amd: Remove return value for mce_threshold_{create,remove}_device() Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 03/20] x86/mce/amd: Remove smca_banks_map Yazen Ghannam
2025-08-25 18:19 ` Borislav Petkov
2025-08-25 19:54 ` Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 04/20] x86/mce/amd: Put list_head in threshold_bank Yazen Ghannam
2025-09-01 15:41 ` Nikolay Borisov
2025-09-01 16:41 ` Borislav Petkov
2025-08-25 17:33 ` [PATCH v5 05/20] x86/mce: Cleanup bank processing on init Yazen Ghannam
2025-08-26 12:35 ` Borislav Petkov
2025-08-26 13:47 ` Yazen Ghannam
2025-08-26 14:33 ` Borislav Petkov
2025-08-25 17:33 ` [PATCH v5 06/20] x86/mce: Remove __mcheck_cpu_init_early() Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 07/20] x86/mce: Reorder __mcheck_cpu_init_generic() call Yazen Ghannam
2025-09-01 17:07 ` Borislav Petkov
2025-09-02 13:30 ` Yazen Ghannam
2025-09-02 16:26 ` Borislav Petkov
2025-09-02 17:14 ` Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 08/20] x86/mce: Define BSP-only init Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 09/20] x86/mce: Define BSP-only SMCA init Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 10/20] x86/mce: Do 'UNKNOWN' vendor check early Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 11/20] x86/mce: Separate global and per-CPU quirks Yazen Ghannam
2025-08-25 17:33 ` Yazen Ghannam [this message]
2025-08-25 17:33 ` [PATCH v5 13/20] x86/mce: Unify AMD THR handler with MCA Polling Yazen Ghannam
2025-09-02 11:10 ` Borislav Petkov
2025-09-02 13:37 ` Yazen Ghannam
2025-09-02 17:04 ` Borislav Petkov
2025-09-02 17:25 ` Yazen Ghannam
2025-09-03 9:48 ` Borislav Petkov
2025-08-25 17:33 ` [PATCH v5 14/20] x86/mce: Unify AMD DFR " Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 15/20] x86/mce/amd: Enable interrupt vectors once per-CPU on SMCA systems Yazen Ghannam
2025-09-03 10:03 ` Borislav Petkov
2025-09-03 14:00 ` Yazen Ghannam
2025-09-03 15:39 ` Borislav Petkov
2025-08-25 17:33 ` [PATCH v5 16/20] x86/mce/amd: Support SMCA Corrected Error Interrupt Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 17/20] x86/mce/amd: Remove redundant reset_block() Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 18/20] x86/mce/amd: Define threshold restart function for banks Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 19/20] x86/mce: Handle AMD threshold interrupt storms Yazen Ghannam
2025-08-25 17:33 ` [PATCH v5 20/20] x86/mce: Save and use APEI corrected threshold limit Yazen Ghannam