Linux EDAC development
* [PATCH] x86/mce: Schedule mce_setup() on correct CPU for CPER decoding
@ 2023-04-17 16:20 Yazen Ghannam
From: Yazen Ghannam @ 2023-04-17 16:20 UTC (permalink / raw)
  To: linux-edac; +Cc: linux-kernel, tony.luck, x86, Yazen Ghannam

Scalable MCA systems may report errors found during boot-time polling
through the ACPI Boot Error Record Table (BERT). The errors are logged
in an "x86 Processor" Common Platform Error Record (CPER). The format of
the x86 CPER does not include a logical CPU number, but it does provide
the logical APIC ID for the logical CPU. Also, it does not explicitly
provide MCA error information, but it can share this information using
an "MSR Context" defined in the CPER format.

The MCA error information is parsed by:
1) Checking that the context matches the Scalable MCA register space.
2) Finding the logical CPU that matches the logical APIC ID from the
   CPER.
3) Filling in struct mce with the relevant data and logging it.
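
The three steps above can be sketched in userspace C. This is only an
illustration, not the kernel code: struct demo_ctx, struct demo_mce,
demo_apicid[] and demo_parse() are hypothetical stand-ins, and the value
used for SMCA_MC0_STATUS is an assumption about the first Scalable MCA
status register address. The 48-byte minimum and the bank extraction
mirror the checks visible in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed address of the bank 0 SMCA MCA_STATUS MSR (illustrative). */
#define SMCA_MC0_STATUS 0xC0002001u

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct demo_ctx {
	uint32_t msr_addr;      /* first MSR covered by the context */
	uint16_t reg_arr_size;  /* size of the register array in bytes */
	const uint64_t *regs;   /* register values, MCA_STATUS first */
};

struct demo_mce {
	int cpu;
	unsigned int bank;
	uint64_t status;
	uint64_t addr;
};

/* Made-up logical CPU -> APIC ID table for a 4-CPU system. */
static const uint32_t demo_apicid[] = { 0, 1, 32, 33 };
#define DEMO_NR_CPUS ((int)(sizeof(demo_apicid) / sizeof(demo_apicid[0])))

static int demo_parse(const struct demo_ctx *ctx, uint64_t lapic_id,
		      struct demo_mce *m)
{
	int cpu;

	assert(ctx && m);

	/* Step 1: the context must start at an SMCA MCA_STATUS register. */
	if ((ctx->msr_addr & SMCA_MC0_STATUS) != SMCA_MC0_STATUS)
		return -EINVAL;
	if (ctx->reg_arr_size < 48)
		return -EINVAL;

	/* Step 2: find the logical CPU whose APIC ID matches the record. */
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (demo_apicid[cpu] == lapic_id)
			break;
	if (cpu == DEMO_NR_CPUS)
		return -EINVAL;

	/* Step 3: fill in the record from the register array. */
	m->cpu = cpu;
	m->bank = (ctx->msr_addr >> 4) & 0xFF;
	m->status = ctx->regs[0];
	m->addr = ctx->regs[1];
	return 0;
}
```

With msr_addr set to 0xC0002031 and an APIC ID of 32, this sketch
selects logical CPU 2 and bank 3; a legacy address such as 0x401 or an
unknown APIC ID is rejected with -EINVAL.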

All the above is done when the BERT is processed during late init. This
can be scheduled on any CPU, and it may be preemptible.

This results in two issues:
1) mce_setup() includes a call to smp_processor_id(). This will throw a
   warning if preemption is enabled.
2) mce_setup() will pull info from the executing CPU, so some info in
   struct mce may be incorrect for the CPU with the error. For example,
   in a dual-socket system, an error logged on a socket 1 CPU but
   processed by a socket 0 CPU will save the PPIN of the socket 0 CPU.

Fix both issues by scheduling mce_setup() to run on the logical CPU
indicated in the error record. smp_call_function_*() disables preemption
around the call, which resolves issue #1, and the error info is gathered
on the proper logical CPU, which resolves issue #2.

Furthermore, smp_call_function_*() rejects invalid CPU numbers and
returns an error, so extra checking by the caller is not necessary.
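
The effect of the fix can be modelled in userspace C. This is a sketch
under stated assumptions rather than kernel code: demo_current_cpu
stands in for smp_processor_id(), demo_setup() for mce_setup(), and
demo_call_on_cpu() for smp_call_function_single(). All names and the
per-CPU tables are hypothetical. The -ENXIO return for an out-of-range
CPU is modelled on how smp_call_function_single() is understood to
behave.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4

/* Made-up per-CPU data: CPUs 0-1 sit on socket 0, CPUs 2-3 on socket 1. */
static const int demo_socket[DEMO_NR_CPUS] = { 0, 0, 1, 1 };
static const uint64_t demo_ppin[DEMO_NR_CPUS] = {
	0xAAAA, 0xAAAA, 0xBBBB, 0xBBBB
};

/* Stand-in for smp_processor_id(): the CPU "executing" right now. */
static int demo_current_cpu;

struct demo_mce {
	int cpu;
	int socketid;
	uint64_t ppin;
};

/* Like mce_setup(): records data of whichever CPU runs the function. */
static void demo_setup(void *info)
{
	struct demo_mce *m = info;

	assert(m);
	m->cpu = demo_current_cpu;
	m->socketid = demo_socket[demo_current_cpu];
	m->ppin = demo_ppin[demo_current_cpu];
}

/*
 * Stand-in for smp_call_function_single(): run fn "on" the given CPU,
 * or fail without calling fn when the CPU number is out of range.
 */
static int demo_call_on_cpu(int cpu, void (*fn)(void *), void *info)
{
	int saved;

	if (cpu < 0 || cpu >= DEMO_NR_CPUS)
		return -ENXIO;

	saved = demo_current_cpu;
	demo_current_cpu = cpu;
	fn(info);
	demo_current_cpu = saved;
	return 0;
}
```

Calling demo_setup() directly while demo_current_cpu is 0 records the
socket 0 PPIN, which is issue #2. Calling
demo_call_on_cpu(2, demo_setup, &m) records the socket 1 data instead,
and passing DEMO_NR_CPUS (the value a search loop holds when nothing
matched) returns an error, so the caller needs no separate range check.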

Fixes: 4a24d80b8c3e ("x86/mce, cper: Pass x86 CPER through the MCA handling chain")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/cpu/mce/apei.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 8ed341714686..5c0381a4a66f 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -63,6 +63,11 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
 }
 EXPORT_SYMBOL_GPL(apei_mce_report_mem_error);
 
+static void __mce_setup(void *info)
+{
+	mce_setup((struct mce *)info);
+}
+
 int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 {
 	const u64 *i_mce = ((const u64 *) (ctx_info + 1));
@@ -97,20 +102,13 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 	if (ctx_info->reg_arr_size < 48)
 		return -EINVAL;
 
-	mce_setup(&m);
-
-	m.extcpu = -1;
-	m.socketid = -1;
-
-	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).initial_apicid == lapic_id) {
-			m.extcpu = cpu;
-			m.socketid = cpu_data(m.extcpu).phys_proc_id;
+	for_each_possible_cpu(cpu)
+		if (cpu_data(cpu).initial_apicid == lapic_id)
 			break;
-		}
-	}
 
-	m.apicid = lapic_id;
+	if (smp_call_function_single(cpu, __mce_setup, &m, 1))
+		return -EINVAL;
+
 	m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
 	m.status = *i_mce;
 	m.addr = *(i_mce + 1);
-- 
2.34.1



Thread overview: 13+ messages, newest 2023-06-16 16:08 UTC
2023-04-17 16:20 [PATCH] x86/mce: Schedule mce_setup() on correct CPU for CPER decoding Yazen Ghannam
2023-04-17 17:17 ` Luck, Tony
2023-04-17 17:28   ` Yazen Ghannam
2023-04-17 17:39     ` Luck, Tony
2023-06-09 14:26       ` Yazen Ghannam
2023-06-15 15:20 ` Borislav Petkov
2023-06-15 15:34   ` Yazen Ghannam
2023-06-15 16:20     ` Borislav Petkov
2023-06-15 17:02       ` Yazen Ghannam
2023-06-15 17:39         ` Yazen Ghannam
2023-06-15 17:54           ` Luck, Tony
2023-06-16 14:16             ` Yazen Ghannam
2023-06-16 16:05               ` Luck, Tony
