* [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups
@ 2009-07-08 22:31 Andi Kleen
2009-07-08 22:31 ` [PATCH] [1/8] x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE Andi Kleen
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
This patchkit removes the old 32bit x86 machine check code.
This was scheduled in feature-removal-schedule.txt for 2.6.32, so the
removal can be queued now.
Also a few minor fixes for issues I noticed while doing that.
They are all fairly harmless and more cleanups than real bug fixes.
And finally(mostly) addresses two of the left over review comments:
- moving the per bank state into a single structure. The CMCI ownership
information is still separate for now because that would have needed more
restructuring, not appropiate for this "trivial" patch series
- centralize the calculation of MSR addresses
-Andi
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [1/8] x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [2/8] x86: mce: Update X86_MCE description in x86/Kconfig Andi Kleen
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
Add a missing depency for ANCIENT_MCE. It didn't matter
in practice because the ANCIENT code wasn't compiled without
X86_MCE, but it's better to express that clearly in
Kconfig.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -826,7 +826,7 @@ config X86_MCE_AMD
config X86_ANCIENT_MCE
def_bool n
- depends on X86_32
+ depends on X86_32 && X86_MCE
prompt "Support for old Pentium 5 / WinChip machine checks"
---help---
Include support for machine check handling on old Pentium 5 or WinChip
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [2/8] x86: mce: Update X86_MCE description in x86/Kconfig
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
2009-07-08 22:31 ` [PATCH] [1/8] x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [3/8] x86: mce: Remove old i386 machine check code Andi Kleen
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
- Clarify that this config controls thermal throttling
reporting too
- Clarify the types of errors reported by machine checks
- Drop references to ancient CPUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/Kconfig | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -777,20 +777,12 @@ config X86_REROUTE_FOR_BROKEN_BOOT_IRQS
increased on these systems.
config X86_MCE
- bool "Machine Check Exception"
+ bool "Machine Check / overheating reporting"
---help---
- Machine Check Exception support allows the processor to notify the
- kernel if it detects a problem (e.g. overheating, component failure).
+ Machine Check support allows the processor to notify the
+ kernel if it detects a problem (e.g. overheating, data corruption).
The action the kernel takes depends on the severity of the problem,
- ranging from a warning message on the console, to halting the machine.
- Your processor must be a Pentium or newer to support this - check the
- flags in /proc/cpuinfo for mce. Note that some older Pentium systems
- have a design flaw which leads to false MCE events - hence MCE is
- disabled on all P5 processors, unless explicitly enabled with "mce"
- as a boot argument. Similarly, if MCE is built in and creates a
- problem on some new non-standard machine, you can boot with "nomce"
- to disable it. MCE support simply ignores non-MCE processors like
- the 386 and 486, so nearly everyone can say Y here.
+ ranging from warning messages to halting the machine.
config X86_OLD_MCE
depends on X86_32 && X86_MCE
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [3/8] x86: mce: Remove old i386 machine check code
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
2009-07-08 22:31 ` [PATCH] [1/8] x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE Andi Kleen
2009-07-08 22:31 ` [PATCH] [2/8] x86: mce: Update X86_MCE description in x86/Kconfig Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [4/8] x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE Andi Kleen
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
As announced in feature-remove-schedule.txt remove CONFIG_X86_OLD_MCE
This patch only removes code.
The ancient machine check code for very old systems that are not supported
by CONFIG_X86_NEW_MCE is still kept.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
Documentation/feature-removal-schedule.txt | 10 -
arch/x86/Kconfig | 35 ------
arch/x86/include/asm/mce.h | 11 -
arch/x86/kernel/cpu/mcheck/Makefile | 2
arch/x86/kernel/cpu/mcheck/k7.c | 116 --------------------
arch/x86/kernel/cpu/mcheck/mce.c | 47 --------
arch/x86/kernel/cpu/mcheck/non-fatal.c | 94 ----------------
arch/x86/kernel/cpu/mcheck/p4.c | 163 -----------------------------
arch/x86/kernel/cpu/mcheck/p6.c | 127 ----------------------
9 files changed, 2 insertions(+), 603 deletions(-)
Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -784,21 +784,10 @@ config X86_MCE
The action the kernel takes depends on the severity of the problem,
ranging from warning messages to halting the machine.
-config X86_OLD_MCE
- depends on X86_32 && X86_MCE
- bool "Use legacy machine check code (will go away)"
- default n
- select X86_ANCIENT_MCE
- ---help---
- Use the old i386 machine check code. This is merely intended for
- testing in a transition period. Try this if you run into any machine
- check related software problems, but report the problem to
- linux-kernel. When in doubt say no.
-
config X86_NEW_MCE
depends on X86_MCE
bool
- default y if (!X86_OLD_MCE && X86_32) || X86_64
+ default y
config X86_MCE_INTEL
def_bool y
@@ -838,29 +827,9 @@ config X86_MCE_INJECT
If you don't know what a machine check is and you don't do kernel
QA it is safe to say n.
-config X86_MCE_NONFATAL
- tristate "Check for non-fatal errors on AMD Athlon/Duron / Intel Pentium 4"
- depends on X86_OLD_MCE
- ---help---
- Enabling this feature starts a timer that triggers every 5 seconds which
- will look at the machine check registers to see if anything happened.
- Non-fatal problems automatically get corrected (but still logged).
- Disable this if you don't want to see these messages.
- Seeing the messages this option prints out may be indicative of dying
- or out-of-spec (ie, overclocked) hardware.
- This option only does something on certain CPUs.
- (AMD Athlon/Duron and Intel Pentium 4)
-
-config X86_MCE_P4THERMAL
- bool "check for P4 thermal throttling interrupt."
- depends on X86_OLD_MCE && X86_MCE && (X86_UP_APIC || SMP)
- ---help---
- Enabling this feature will cause a message to be printed when the P4
- enters thermal throttling.
-
config X86_THERMAL_VECTOR
def_bool y
- depends on X86_MCE_P4THERMAL || X86_MCE_INTEL
+ depends on X86_MCE_INTEL
config VM86
bool "Enable VM86 support" if EMBEDDED
Index: linux/arch/x86/kernel/cpu/mcheck/Makefile
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/Makefile
+++ linux/arch/x86/kernel/cpu/mcheck/Makefile
@@ -1,11 +1,9 @@
obj-y = mce.o
obj-$(CONFIG_X86_NEW_MCE) += mce-severity.o
-obj-$(CONFIG_X86_OLD_MCE) += k7.o p4.o p6.o
obj-$(CONFIG_X86_ANCIENT_MCE) += winchip.o p5.o
obj-$(CONFIG_X86_MCE_INTEL) += mce_intel.o
obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
-obj-$(CONFIG_X86_MCE_NONFATAL) += non-fatal.o
obj-$(CONFIG_X86_MCE_THRESHOLD) += threshold.o
obj-$(CONFIG_X86_MCE_INJECT) += mce-inject.o
Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c
@@ -58,8 +58,6 @@ void (*machine_check_vector)(struct pt_r
int mce_disabled __read_mostly;
-#ifdef CONFIG_X86_NEW_MCE
-
#define MISC_MCELOG_MINOR 227
#define SPINUNIT 100 /* 100ns */
@@ -1993,51 +1991,6 @@ static __init int mce_init_device(void)
device_initcall(mce_init_device);
-#else /* CONFIG_X86_OLD_MCE: */
-
-int nr_mce_banks;
-EXPORT_SYMBOL_GPL(nr_mce_banks); /* non-fatal.o */
-
-/* This has to be run for each processor */
-void mcheck_init(struct cpuinfo_x86 *c)
-{
- if (mce_disabled)
- return;
-
- switch (c->x86_vendor) {
- case X86_VENDOR_AMD:
- amd_mcheck_init(c);
- break;
-
- case X86_VENDOR_INTEL:
- if (c->x86 == 5)
- intel_p5_mcheck_init(c);
- if (c->x86 == 6)
- intel_p6_mcheck_init(c);
- if (c->x86 == 15)
- intel_p4_mcheck_init(c);
- break;
-
- case X86_VENDOR_CENTAUR:
- if (c->x86 == 5)
- winchip_mcheck_init(c);
- break;
-
- default:
- break;
- }
- printk(KERN_INFO "mce: CPU supports %d MCE banks\n", nr_mce_banks);
-}
-
-static int __init mcheck_enable(char *str)
-{
- mce_p5_enabled = 1;
- return 1;
-}
-__setup("mce", mcheck_enable);
-
-#endif /* CONFIG_X86_OLD_MCE */
-
/*
* Old style boot options parsing. Only for compatibility.
*/
Index: linux/arch/x86/include/asm/mce.h
===================================================================
--- linux.orig/arch/x86/include/asm/mce.h
+++ linux/arch/x86/include/asm/mce.h
@@ -115,13 +115,6 @@ void mcheck_init(struct cpuinfo_x86 *c);
static inline void mcheck_init(struct cpuinfo_x86 *c) {}
#endif
-#ifdef CONFIG_X86_OLD_MCE
-extern int nr_mce_banks;
-void amd_mcheck_init(struct cpuinfo_x86 *c);
-void intel_p4_mcheck_init(struct cpuinfo_x86 *c);
-void intel_p6_mcheck_init(struct cpuinfo_x86 *c);
-#endif
-
#ifdef CONFIG_X86_ANCIENT_MCE
void intel_p5_mcheck_init(struct cpuinfo_x86 *c);
void winchip_mcheck_init(struct cpuinfo_x86 *c);
@@ -208,11 +201,7 @@ extern void (*threshold_cpu_callback)(un
void intel_init_thermal(struct cpuinfo_x86 *c);
-#ifdef CONFIG_X86_NEW_MCE
void mce_log_therm_throt_event(__u64 status);
-#else
-static inline void mce_log_therm_throt_event(__u64 status) {}
-#endif
#endif /* __KERNEL__ */
#endif /* _ASM_X86_MCE_H */
Index: linux/arch/x86/kernel/cpu/mcheck/k7.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/k7.c
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Athlon specific Machine Check Exception Reporting
- * (C) Copyright 2002 Dave Jones <davej@redhat.com>
- */
-#include <linux/interrupt.h>
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/init.h>
-#include <linux/smp.h>
-
-#include <asm/processor.h>
-#include <asm/system.h>
-#include <asm/mce.h>
-#include <asm/msr.h>
-
-/* Machine Check Handler For AMD Athlon/Duron: */
-static void k7_machine_check(struct pt_regs *regs, long error_code)
-{
- u32 alow, ahigh, high, low;
- u32 mcgstl, mcgsth;
- int recover = 1;
- int i;
-
- rdmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
- if (mcgstl & (1<<0)) /* Recoverable ? */
- recover = 0;
-
- printk(KERN_EMERG "CPU %d: Machine Check Exception: %08x%08x\n",
- smp_processor_id(), mcgsth, mcgstl);
-
- for (i = 1; i < nr_mce_banks; i++) {
- rdmsr(MSR_IA32_MC0_STATUS+i*4, low, high);
- if (high & (1<<31)) {
- char misc[20];
- char addr[24];
-
- misc[0] = '\0';
- addr[0] = '\0';
-
- if (high & (1<<29))
- recover |= 1;
- if (high & (1<<25))
- recover |= 2;
- high &= ~(1<<31);
-
- if (high & (1<<27)) {
- rdmsr(MSR_IA32_MC0_MISC+i*4, alow, ahigh);
- snprintf(misc, 20, "[%08x%08x]", ahigh, alow);
- }
- if (high & (1<<26)) {
- rdmsr(MSR_IA32_MC0_ADDR+i*4, alow, ahigh);
- snprintf(addr, 24, " at %08x%08x", ahigh, alow);
- }
-
- printk(KERN_EMERG "CPU %d: Bank %d: %08x%08x%s%s\n",
- smp_processor_id(), i, high, low, misc, addr);
-
- /* Clear it: */
- wrmsr(MSR_IA32_MC0_STATUS+i*4, 0UL, 0UL);
- /* Serialize: */
- wmb();
- add_taint(TAINT_MACHINE_CHECK);
- }
- }
-
- if (recover & 2)
- panic("CPU context corrupt");
- if (recover & 1)
- panic("Unable to continue");
-
- printk(KERN_EMERG "Attempting to continue.\n");
-
- mcgstl &= ~(1<<2);
- wrmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
-}
-
-
-/* AMD K7 machine check is Intel like: */
-void amd_mcheck_init(struct cpuinfo_x86 *c)
-{
- u32 l, h;
- int i;
-
- if (!cpu_has(c, X86_FEATURE_MCE))
- return;
-
- machine_check_vector = k7_machine_check;
- /* Make sure the vector pointer is visible before we enable MCEs: */
- wmb();
-
- printk(KERN_INFO "Intel machine check architecture supported.\n");
-
- rdmsr(MSR_IA32_MCG_CAP, l, h);
- if (l & (1<<8)) /* Control register present ? */
- wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
- nr_mce_banks = l & 0xff;
-
- /*
- * Clear status for MC index 0 separately, we don't touch CTL,
- * as some K7 Athlons cause spurious MCEs when its enabled:
- */
- if (boot_cpu_data.x86 == 6) {
- wrmsr(MSR_IA32_MC0_STATUS, 0x0, 0x0);
- i = 1;
- } else
- i = 0;
-
- for (; i < nr_mce_banks; i++) {
- wrmsr(MSR_IA32_MC0_CTL+4*i, 0xffffffff, 0xffffffff);
- wrmsr(MSR_IA32_MC0_STATUS+4*i, 0x0, 0x0);
- }
-
- set_in_cr4(X86_CR4_MCE);
- printk(KERN_INFO "Intel machine check reporting enabled on CPU#%d.\n",
- smp_processor_id());
-}
Index: linux/arch/x86/kernel/cpu/mcheck/p4.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/p4.c
+++ /dev/null
@@ -1,163 +0,0 @@
-/*
- * P4 specific Machine Check Exception Reporting
- */
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/init.h>
-#include <linux/smp.h>
-
-#include <asm/processor.h>
-#include <asm/mce.h>
-#include <asm/msr.h>
-
-/* as supported by the P4/Xeon family */
-struct intel_mce_extended_msrs {
- u32 eax;
- u32 ebx;
- u32 ecx;
- u32 edx;
- u32 esi;
- u32 edi;
- u32 ebp;
- u32 esp;
- u32 eflags;
- u32 eip;
- /* u32 *reserved[]; */
-};
-
-static int mce_num_extended_msrs;
-
-/* P4/Xeon Extended MCE MSR retrieval, return 0 if unsupported */
-static void intel_get_extended_msrs(struct intel_mce_extended_msrs *r)
-{
- u32 h;
-
- rdmsr(MSR_IA32_MCG_EAX, r->eax, h);
- rdmsr(MSR_IA32_MCG_EBX, r->ebx, h);
- rdmsr(MSR_IA32_MCG_ECX, r->ecx, h);
- rdmsr(MSR_IA32_MCG_EDX, r->edx, h);
- rdmsr(MSR_IA32_MCG_ESI, r->esi, h);
- rdmsr(MSR_IA32_MCG_EDI, r->edi, h);
- rdmsr(MSR_IA32_MCG_EBP, r->ebp, h);
- rdmsr(MSR_IA32_MCG_ESP, r->esp, h);
- rdmsr(MSR_IA32_MCG_EFLAGS, r->eflags, h);
- rdmsr(MSR_IA32_MCG_EIP, r->eip, h);
-}
-
-static void intel_machine_check(struct pt_regs *regs, long error_code)
-{
- u32 alow, ahigh, high, low;
- u32 mcgstl, mcgsth;
- int recover = 1;
- int i;
-
- rdmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
- if (mcgstl & (1<<0)) /* Recoverable ? */
- recover = 0;
-
- printk(KERN_EMERG "CPU %d: Machine Check Exception: %08x%08x\n",
- smp_processor_id(), mcgsth, mcgstl);
-
- if (mce_num_extended_msrs > 0) {
- struct intel_mce_extended_msrs dbg;
-
- intel_get_extended_msrs(&dbg);
-
- printk(KERN_DEBUG "CPU %d: EIP: %08x EFLAGS: %08x\n"
- "\teax: %08x ebx: %08x ecx: %08x edx: %08x\n"
- "\tesi: %08x edi: %08x ebp: %08x esp: %08x\n",
- smp_processor_id(), dbg.eip, dbg.eflags,
- dbg.eax, dbg.ebx, dbg.ecx, dbg.edx,
- dbg.esi, dbg.edi, dbg.ebp, dbg.esp);
- }
-
- for (i = 0; i < nr_mce_banks; i++) {
- rdmsr(MSR_IA32_MC0_STATUS+i*4, low, high);
- if (high & (1<<31)) {
- char misc[20];
- char addr[24];
-
- misc[0] = addr[0] = '\0';
- if (high & (1<<29))
- recover |= 1;
- if (high & (1<<25))
- recover |= 2;
- high &= ~(1<<31);
- if (high & (1<<27)) {
- rdmsr(MSR_IA32_MC0_MISC+i*4, alow, ahigh);
- snprintf(misc, 20, "[%08x%08x]", ahigh, alow);
- }
- if (high & (1<<26)) {
- rdmsr(MSR_IA32_MC0_ADDR+i*4, alow, ahigh);
- snprintf(addr, 24, " at %08x%08x", ahigh, alow);
- }
- printk(KERN_EMERG "CPU %d: Bank %d: %08x%08x%s%s\n",
- smp_processor_id(), i, high, low, misc, addr);
- }
- }
-
- if (recover & 2)
- panic("CPU context corrupt");
- if (recover & 1)
- panic("Unable to continue");
-
- printk(KERN_EMERG "Attempting to continue.\n");
-
- /*
- * Do not clear the MSR_IA32_MCi_STATUS if the error is not
- * recoverable/continuable.This will allow BIOS to look at the MSRs
- * for errors if the OS could not log the error.
- */
- for (i = 0; i < nr_mce_banks; i++) {
- u32 msr;
- msr = MSR_IA32_MC0_STATUS+i*4;
- rdmsr(msr, low, high);
- if (high&(1<<31)) {
- /* Clear it */
- wrmsr(msr, 0UL, 0UL);
- /* Serialize */
- wmb();
- add_taint(TAINT_MACHINE_CHECK);
- }
- }
- mcgstl &= ~(1<<2);
- wrmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
-}
-
-void intel_p4_mcheck_init(struct cpuinfo_x86 *c)
-{
- u32 l, h;
- int i;
-
- machine_check_vector = intel_machine_check;
- wmb();
-
- printk(KERN_INFO "Intel machine check architecture supported.\n");
- rdmsr(MSR_IA32_MCG_CAP, l, h);
- if (l & (1<<8)) /* Control register present ? */
- wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
- nr_mce_banks = l & 0xff;
-
- for (i = 0; i < nr_mce_banks; i++) {
- wrmsr(MSR_IA32_MC0_CTL+4*i, 0xffffffff, 0xffffffff);
- wrmsr(MSR_IA32_MC0_STATUS+4*i, 0x0, 0x0);
- }
-
- set_in_cr4(X86_CR4_MCE);
- printk(KERN_INFO "Intel machine check reporting enabled on CPU#%d.\n",
- smp_processor_id());
-
- /* Check for P4/Xeon extended MCE MSRs */
- rdmsr(MSR_IA32_MCG_CAP, l, h);
- if (l & (1<<9)) {/* MCG_EXT_P */
- mce_num_extended_msrs = (l >> 16) & 0xff;
- printk(KERN_INFO "CPU%d: Intel P4/Xeon Extended MCE MSRs (%d)"
- " available\n",
- smp_processor_id(), mce_num_extended_msrs);
-
-#ifdef CONFIG_X86_MCE_P4THERMAL
- /* Check for P4/Xeon Thermal monitor */
- intel_init_thermal(c);
-#endif
- }
-}
Index: linux/arch/x86/kernel/cpu/mcheck/p6.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/p6.c
+++ /dev/null
@@ -1,127 +0,0 @@
-/*
- * P6 specific Machine Check Exception Reporting
- * (C) Copyright 2002 Alan Cox <alan@lxorguk.ukuu.org.uk>
- */
-#include <linux/interrupt.h>
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/init.h>
-#include <linux/smp.h>
-
-#include <asm/processor.h>
-#include <asm/system.h>
-#include <asm/mce.h>
-#include <asm/msr.h>
-
-/* Machine Check Handler For PII/PIII */
-static void intel_machine_check(struct pt_regs *regs, long error_code)
-{
- u32 alow, ahigh, high, low;
- u32 mcgstl, mcgsth;
- int recover = 1;
- int i;
-
- rdmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
- if (mcgstl & (1<<0)) /* Recoverable ? */
- recover = 0;
-
- printk(KERN_EMERG "CPU %d: Machine Check Exception: %08x%08x\n",
- smp_processor_id(), mcgsth, mcgstl);
-
- for (i = 0; i < nr_mce_banks; i++) {
- rdmsr(MSR_IA32_MC0_STATUS+i*4, low, high);
- if (high & (1<<31)) {
- char misc[20];
- char addr[24];
-
- misc[0] = '\0';
- addr[0] = '\0';
-
- if (high & (1<<29))
- recover |= 1;
- if (high & (1<<25))
- recover |= 2;
- high &= ~(1<<31);
-
- if (high & (1<<27)) {
- rdmsr(MSR_IA32_MC0_MISC+i*4, alow, ahigh);
- snprintf(misc, 20, "[%08x%08x]", ahigh, alow);
- }
- if (high & (1<<26)) {
- rdmsr(MSR_IA32_MC0_ADDR+i*4, alow, ahigh);
- snprintf(addr, 24, " at %08x%08x", ahigh, alow);
- }
-
- printk(KERN_EMERG "CPU %d: Bank %d: %08x%08x%s%s\n",
- smp_processor_id(), i, high, low, misc, addr);
- }
- }
-
- if (recover & 2)
- panic("CPU context corrupt");
- if (recover & 1)
- panic("Unable to continue");
-
- printk(KERN_EMERG "Attempting to continue.\n");
- /*
- * Do not clear the MSR_IA32_MCi_STATUS if the error is not
- * recoverable/continuable.This will allow BIOS to look at the MSRs
- * for errors if the OS could not log the error:
- */
- for (i = 0; i < nr_mce_banks; i++) {
- unsigned int msr;
-
- msr = MSR_IA32_MC0_STATUS+i*4;
- rdmsr(msr, low, high);
- if (high & (1<<31)) {
- /* Clear it: */
- wrmsr(msr, 0UL, 0UL);
- /* Serialize: */
- wmb();
- add_taint(TAINT_MACHINE_CHECK);
- }
- }
- mcgstl &= ~(1<<2);
- wrmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
-}
-
-/* Set up machine check reporting for processors with Intel style MCE: */
-void intel_p6_mcheck_init(struct cpuinfo_x86 *c)
-{
- u32 l, h;
- int i;
-
- /* Check for MCE support */
- if (!cpu_has(c, X86_FEATURE_MCE))
- return;
-
- /* Check for PPro style MCA */
- if (!cpu_has(c, X86_FEATURE_MCA))
- return;
-
- /* Ok machine check is available */
- machine_check_vector = intel_machine_check;
- /* Make sure the vector pointer is visible before we enable MCEs: */
- wmb();
-
- printk(KERN_INFO "Intel machine check architecture supported.\n");
- rdmsr(MSR_IA32_MCG_CAP, l, h);
- if (l & (1<<8)) /* Control register present ? */
- wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
- nr_mce_banks = l & 0xff;
-
- /*
- * Following the example in IA-32 SDM Vol 3:
- * - MC0_CTL should not be written
- * - Status registers on all banks should be cleared on reset
- */
- for (i = 1; i < nr_mce_banks; i++)
- wrmsr(MSR_IA32_MC0_CTL+4*i, 0xffffffff, 0xffffffff);
-
- for (i = 0; i < nr_mce_banks; i++)
- wrmsr(MSR_IA32_MC0_STATUS+4*i, 0x0, 0x0);
-
- set_in_cr4(X86_CR4_MCE);
- printk(KERN_INFO "Intel machine check reporting enabled on CPU#%d.\n",
- smp_processor_id());
-}
Index: linux/arch/x86/kernel/cpu/mcheck/non-fatal.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/non-fatal.c
+++ /dev/null
@@ -1,94 +0,0 @@
-/*
- * Non Fatal Machine Check Exception Reporting
- *
- * (C) Copyright 2002 Dave Jones. <davej@redhat.com>
- *
- * This file contains routines to check for non-fatal MCEs every 15s
- *
- */
-#include <linux/interrupt.h>
-#include <linux/workqueue.h>
-#include <linux/jiffies.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/types.h>
-#include <linux/init.h>
-#include <linux/smp.h>
-
-#include <asm/processor.h>
-#include <asm/system.h>
-#include <asm/mce.h>
-#include <asm/msr.h>
-
-static int firstbank;
-
-#define MCE_RATE (15*HZ) /* timer rate is 15s */
-
-static void mce_checkregs(void *info)
-{
- u32 low, high;
- int i;
-
- for (i = firstbank; i < nr_mce_banks; i++) {
- rdmsr(MSR_IA32_MC0_STATUS+i*4, low, high);
-
- if (!(high & (1<<31)))
- continue;
-
- printk(KERN_INFO "MCE: The hardware reports a non fatal, "
- "correctable incident occurred on CPU %d.\n",
- smp_processor_id());
-
- printk(KERN_INFO "Bank %d: %08x%08x\n", i, high, low);
-
- /*
- * Scrub the error so we don't pick it up in MCE_RATE
- * seconds time:
- */
- wrmsr(MSR_IA32_MC0_STATUS+i*4, 0UL, 0UL);
-
- /* Serialize: */
- wmb();
- add_taint(TAINT_MACHINE_CHECK);
- }
-}
-
-static void mce_work_fn(struct work_struct *work);
-static DECLARE_DELAYED_WORK(mce_work, mce_work_fn);
-
-static void mce_work_fn(struct work_struct *work)
-{
- on_each_cpu(mce_checkregs, NULL, 1);
- schedule_delayed_work(&mce_work, round_jiffies_relative(MCE_RATE));
-}
-
-static int __init init_nonfatal_mce_checker(void)
-{
- struct cpuinfo_x86 *c = &boot_cpu_data;
-
- /* Check for MCE support */
- if (!cpu_has(c, X86_FEATURE_MCE))
- return -ENODEV;
-
- /* Check for PPro style MCA */
- if (!cpu_has(c, X86_FEATURE_MCA))
- return -ENODEV;
-
- /* Some Athlons misbehave when we frob bank 0 */
- if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
- boot_cpu_data.x86 == 6)
- firstbank = 1;
- else
- firstbank = 0;
-
- /*
- * Check for non-fatal errors every MCE_RATE s
- */
- schedule_delayed_work(&mce_work, round_jiffies_relative(MCE_RATE));
- printk(KERN_INFO "Machine check exception polling timer started.\n");
-
- return 0;
-}
-module_init(init_nonfatal_mce_checker);
-
-MODULE_LICENSE("GPL");
Index: linux/Documentation/feature-removal-schedule.txt
===================================================================
--- linux.orig/Documentation/feature-removal-schedule.txt
+++ linux/Documentation/feature-removal-schedule.txt
@@ -448,13 +448,3 @@ What: CONFIG_RFKILL_INPUT
When: 2.6.33
Why: Should be implemented in userspace, policy daemon.
Who: Johannes Berg <johannes@sipsolutions.net>
-
-----------------------------
-
-What: CONFIG_X86_OLD_MCE
-When: 2.6.32
-Why: Remove the old legacy 32bit machine check code. This has been
- superseded by the newer machine check code from the 64bit port,
- but the old version has been kept around for easier testing. Note this
- doesn't impact the old P5 and WinChip machine check handlers.
-Who: Andi Kleen <andi@firstfloor.org>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [4/8] x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
` (2 preceding siblings ...)
2009-07-08 22:31 ` [PATCH] [3/8] x86: mce: Remove old i386 machine check code Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [5/8] x86: mce: Move code in mce.c Andi Kleen
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
Drop the CONFIG_X86_NEW_MCE symbol and change all
references to it to check for CONFIG_X86_MCE directly.
No code changes
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/Kconfig | 11 +++--------
arch/x86/include/asm/entry_arch.h | 2 +-
arch/x86/kernel/apic/nmi.c | 2 +-
arch/x86/kernel/cpu/mcheck/Makefile | 3 +--
arch/x86/kernel/irq.c | 4 ++--
arch/x86/kernel/irqinit.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
7 files changed, 10 insertions(+), 16 deletions(-)
Index: linux/arch/x86/Kconfig
===================================================================
--- linux.orig/arch/x86/Kconfig
+++ linux/arch/x86/Kconfig
@@ -784,15 +784,10 @@ config X86_MCE
The action the kernel takes depends on the severity of the problem,
ranging from warning messages to halting the machine.
-config X86_NEW_MCE
- depends on X86_MCE
- bool
- default y
-
config X86_MCE_INTEL
def_bool y
prompt "Intel MCE features"
- depends on X86_NEW_MCE && X86_LOCAL_APIC
+ depends on X86_MCE && X86_LOCAL_APIC
---help---
Additional support for intel specific MCE features such as
the thermal monitor.
@@ -800,7 +795,7 @@ config X86_MCE_INTEL
config X86_MCE_AMD
def_bool y
prompt "AMD MCE features"
- depends on X86_NEW_MCE && X86_LOCAL_APIC
+ depends on X86_MCE && X86_LOCAL_APIC
---help---
Additional support for AMD specific MCE features such as
the DRAM Error Threshold.
@@ -820,7 +815,7 @@ config X86_MCE_THRESHOLD
default y
config X86_MCE_INJECT
- depends on X86_NEW_MCE
+ depends on X86_MCE
tristate "Machine check injector support"
---help---
Provide support for injecting machine checks for testing purposes.
Index: linux/arch/x86/include/asm/entry_arch.h
===================================================================
--- linux.orig/arch/x86/include/asm/entry_arch.h
+++ linux/arch/x86/include/asm/entry_arch.h
@@ -61,7 +61,7 @@ BUILD_INTERRUPT(thermal_interrupt,THERMA
BUILD_INTERRUPT(threshold_interrupt,THRESHOLD_APIC_VECTOR)
#endif
-#ifdef CONFIG_X86_NEW_MCE
+#ifdef CONFIG_X86_MCE
BUILD_INTERRUPT(mce_self_interrupt,MCE_SELF_VECTOR)
#endif
Index: linux/arch/x86/kernel/cpu/mcheck/Makefile
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/Makefile
+++ linux/arch/x86/kernel/cpu/mcheck/Makefile
@@ -1,6 +1,5 @@
-obj-y = mce.o
+obj-y = mce.o mce-severity.o
-obj-$(CONFIG_X86_NEW_MCE) += mce-severity.o
obj-$(CONFIG_X86_ANCIENT_MCE) += winchip.o p5.o
obj-$(CONFIG_X86_MCE_INTEL) += mce_intel.o
obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
Index: linux/arch/x86/kernel/apic/nmi.c
===================================================================
--- linux.orig/arch/x86/kernel/apic/nmi.c
+++ linux/arch/x86/kernel/apic/nmi.c
@@ -66,7 +66,7 @@ static inline unsigned int get_nmi_count
static inline int mce_in_progress(void)
{
-#if defined(CONFIG_X86_NEW_MCE)
+#if defined(CONFIG_X86_MCE)
return atomic_read(&mce_entry) > 0;
#endif
return 0;
Index: linux/arch/x86/kernel/irq.c
===================================================================
--- linux.orig/arch/x86/kernel/irq.c
+++ linux/arch/x86/kernel/irq.c
@@ -104,7 +104,7 @@ static int show_other_interrupts(struct
seq_printf(p, " Threshold APIC interrupts\n");
# endif
#endif
-#ifdef CONFIG_X86_NEW_MCE
+#ifdef CONFIG_X86_MCE
seq_printf(p, "%*s: ", prec, "MCE");
for_each_online_cpu(j)
seq_printf(p, "%10u ", per_cpu(mce_exception_count, j));
@@ -200,7 +200,7 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
sum += irq_stats(cpu)->irq_threshold_count;
# endif
#endif
-#ifdef CONFIG_X86_NEW_MCE
+#ifdef CONFIG_X86_MCE
sum += per_cpu(mce_exception_count, cpu);
sum += per_cpu(mce_poll_count, cpu);
#endif
Index: linux/arch/x86/kernel/irqinit.c
===================================================================
--- linux.orig/arch/x86/kernel/irqinit.c
+++ linux/arch/x86/kernel/irqinit.c
@@ -190,7 +190,7 @@ static void __init apic_intr_init(void)
#ifdef CONFIG_X86_THRESHOLD
alloc_intr_gate(THRESHOLD_APIC_VECTOR, threshold_interrupt);
#endif
-#if defined(CONFIG_X86_NEW_MCE) && defined(CONFIG_X86_LOCAL_APIC)
+#if defined(CONFIG_X86_MCE) && defined(CONFIG_X86_LOCAL_APIC)
alloc_intr_gate(MCE_SELF_VECTOR, mce_self_interrupt);
#endif
Index: linux/arch/x86/kernel/signal.c
===================================================================
--- linux.orig/arch/x86/kernel/signal.c
+++ linux/arch/x86/kernel/signal.c
@@ -856,7 +856,7 @@ static void do_signal(struct pt_regs *re
void
do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
{
-#ifdef CONFIG_X86_NEW_MCE
+#ifdef CONFIG_X86_MCE
/* notify userspace of pending MCEs */
if (thread_info_flags & _TIF_MCE_NOTIFY)
mce_notify_process();
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [5/8] x86: mce: Move code in mce.c
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
` (3 preceding siblings ...)
2009-07-08 22:31 ` [PATCH] [4/8] x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [6/8] x86: mce: Move per bank data in a single datastructure Andi Kleen
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
Now that the X86_OLD_MCE ifdefs are gone move some code that
used to be outside the big ifdef to a more natural place
near its user.
No code change.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/cpu/mcheck/mce.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c
@@ -45,17 +45,6 @@
#include "mce-internal.h"
-/* Handle unconfigured int18 (should never happen) */
-static void unexpected_machine_check(struct pt_regs *regs, long error_code)
-{
- printk(KERN_ERR "CPU#%d: Unexpected int18 (Machine Check).\n",
- smp_processor_id());
-}
-
-/* Call the installed machine check handler for this CPU setup. */
-void (*machine_check_vector)(struct pt_regs *, long error_code) =
- unexpected_machine_check;
-
int mce_disabled __read_mostly;
#define MISC_MCELOG_MINOR 227
@@ -1322,6 +1311,17 @@ static void mce_init_timer(void)
add_timer(t);
}
+/* Handle unconfigured int18 (should never happen) */
+static void unexpected_machine_check(struct pt_regs *regs, long error_code)
+{
+ printk(KERN_ERR "CPU#%d: Unexpected int18 (Machine Check).\n",
+ smp_processor_id());
+}
+
+/* Call the installed machine check handler for this CPU setup. */
+void (*machine_check_vector)(struct pt_regs *, long error_code) =
+ unexpected_machine_check;
+
/*
* Called for each booted CPU to set up machine checks.
* Must be called with preempt off:
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [6/8] x86: mce: Move per bank data in a single datastructure
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
` (4 preceding siblings ...)
2009-07-08 22:31 ` [PATCH] [5/8] x86: mce: Move code in mce.c Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [7/8] x86: mce: macros to compute banks MSRs Andi Kleen
2009-07-08 22:31 ` [PATCH] [8/8] x86: mce: Lower maximum number of banks to architecture limit Andi Kleen
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
This addresses one of the leftover review comments.
Move the per bank data into a single structure. This avoids
several separate variables and also separate allocation of sysfs objects.
I didn't move the CMCI ownership information so far because
that would have needed some non trivial changes in the algorithms.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/cpu/mcheck/mce-internal.h | 14 +++
arch/x86/kernel/cpu/mcheck/mce.c | 109 ++++++++++++++----------------
2 files changed, 67 insertions(+), 56 deletions(-)
Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c
@@ -64,7 +64,6 @@ DEFINE_PER_CPU(unsigned, mce_exception_c
*/
static int tolerant __read_mostly = 1;
static int banks __read_mostly;
-static u64 *bank __read_mostly;
static int rip_msr __read_mostly;
static int mce_bootlog __read_mostly = -1;
static int monarch_timeout __read_mostly = -1;
@@ -74,13 +73,13 @@ int mce_cmci_disabled __read_mostly;
int mce_ignore_ce __read_mostly;
int mce_ser __read_mostly;
+struct mce_bank *mce_banks __read_mostly;
+
/* User mode helper program triggered by machine check event */
static unsigned long mce_need_notify;
static char mce_helper[128];
static char *mce_helper_argv[2] = { mce_helper, NULL };
-static unsigned long dont_init_banks;
-
static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
static DEFINE_PER_CPU(struct mce, mces_seen);
static int cpu_missing;
@@ -91,11 +90,6 @@ DEFINE_PER_CPU(mce_banks_t, mce_poll_ban
[0 ... BITS_TO_LONGS(MAX_NR_BANKS)-1] = ~0UL
};
-static inline int skip_bank_init(int i)
-{
- return i < BITS_PER_LONG && test_bit(i, &dont_init_banks);
-}
-
static DEFINE_PER_CPU(struct work_struct, mce_work);
/* Do initial initialization of a struct mce */
@@ -482,7 +476,7 @@ void machine_check_poll(enum mcp_flags f
m.mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
for (i = 0; i < banks; i++) {
- if (!bank[i] || !test_bit(i, *b))
+ if (!mce_banks[i].ctl || !test_bit(i, *b))
continue;
m.misc = 0;
@@ -903,7 +897,7 @@ void do_machine_check(struct pt_regs *re
order = mce_start(&no_way_out);
for (i = 0; i < banks; i++) {
__clear_bit(i, toclear);
- if (!bank[i])
+ if (!mce_banks[i].ctl)
continue;
m.misc = 0;
@@ -1146,6 +1140,21 @@ int mce_notify_irq(void)
}
EXPORT_SYMBOL_GPL(mce_notify_irq);
+static int mce_banks_init(void)
+{
+ int i;
+
+ mce_banks = kzalloc(banks * sizeof(struct mce_bank), GFP_KERNEL);
+ if (!mce_banks)
+ return -ENOMEM;
+ for (i = 0; i < banks; i++) {
+ struct mce_bank *b = &mce_banks[i];
+ b->ctl = -1ULL;
+ b->init = 1;
+ }
+ return 0;
+}
+
/*
* Initialize Machine Checks for a CPU.
*/
@@ -1169,11 +1178,10 @@ static int mce_cap_init(void)
/* Don't support asymmetric configurations today */
WARN_ON(banks != 0 && b != banks);
banks = b;
- if (!bank) {
- bank = kmalloc(banks * sizeof(u64), GFP_KERNEL);
- if (!bank)
- return -ENOMEM;
- memset(bank, 0xff, banks * sizeof(u64));
+ if (!mce_banks) {
+ int err = mce_banks_init();
+ if (err)
+ return err;
}
/* Use accurate RIP reporting if available. */
@@ -1205,9 +1213,10 @@ static void mce_init(void)
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
for (i = 0; i < banks; i++) {
- if (skip_bank_init(i))
+ struct mce_bank *b = &mce_banks[i];
+ if (!b->init)
continue;
- wrmsrl(MSR_IA32_MC0_CTL+4*i, bank[i]);
+ wrmsrl(MSR_IA32_MC0_CTL+4*i, b->ctl);
wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
}
}
@@ -1223,7 +1232,7 @@ static void mce_cpu_quirks(struct cpuinf
* trips off incorrectly with the IOMMU & 3ware
* & Cerberus:
*/
- clear_bit(10, (unsigned long *)&bank[4]);
+ clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
}
if (c->x86 <= 17 && mce_bootlog < 0) {
/*
@@ -1237,7 +1246,7 @@ static void mce_cpu_quirks(struct cpuinf
* by default.
*/
if (c->x86 == 6 && banks > 0)
- bank[0] = 0;
+ mce_banks[0].ctl = 0;
}
if (c->x86_vendor == X86_VENDOR_INTEL) {
@@ -1250,8 +1259,8 @@ static void mce_cpu_quirks(struct cpuinf
* valid event later, merely don't write CTL0.
*/
- if (c->x86 == 6 && c->x86_model < 0x1A)
- __set_bit(0, &dont_init_banks);
+ if (c->x86 == 6 && c->x86_model < 0x1A && banks > 0)
+ mce_banks[0].init = 0;
/*
* All newer Intel systems support MCE broadcasting. Enable
@@ -1578,7 +1587,8 @@ static int mce_disable(void)
int i;
for (i = 0; i < banks; i++) {
- if (!skip_bank_init(i))
+ struct mce_bank *b = &mce_banks[i];
+ if (b->init)
wrmsrl(MSR_IA32_MC0_CTL + i*4, 0);
}
return 0;
@@ -1654,14 +1664,15 @@ DEFINE_PER_CPU(struct sys_device, mce_de
__cpuinitdata
void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu);
-static struct sysdev_attribute *bank_attrs;
+static inline struct mce_bank *attr_to_bank(struct sysdev_attribute *attr)
+{
+ return container_of(attr, struct mce_bank, attr);
+}
static ssize_t show_bank(struct sys_device *s, struct sysdev_attribute *attr,
char *buf)
{
- u64 b = bank[attr - bank_attrs];
-
- return sprintf(buf, "%llx\n", b);
+ return sprintf(buf, "%llx\n", attr_to_bank(attr)->ctl);
}
static ssize_t set_bank(struct sys_device *s, struct sysdev_attribute *attr,
@@ -1672,7 +1683,7 @@ static ssize_t set_bank(struct sys_devic
if (strict_strtoull(buf, 0, &new) < 0)
return -EINVAL;
- bank[attr - bank_attrs] = new;
+ attr_to_bank(attr)->ctl = new;
mce_restart();
return size;
@@ -1816,7 +1827,7 @@ static __cpuinit int mce_create_device(u
}
for (j = 0; j < banks; j++) {
err = sysdev_create_file(&per_cpu(mce_dev, cpu),
- &bank_attrs[j]);
+ &mce_banks[j].attr);
if (err)
goto error2;
}
@@ -1825,10 +1836,10 @@ static __cpuinit int mce_create_device(u
return 0;
error2:
while (--j >= 0)
- sysdev_remove_file(&per_cpu(mce_dev, cpu), &bank_attrs[j]);
+ sysdev_remove_file(&per_cpu(mce_dev, cpu), &mce_banks[j].attr);
error:
while (--i >= 0)
- sysdev_remove_file(&per_cpu(mce_dev, cpu), mce_attrs[i]);
+ sysdev_remove_file(&per_cpu(mce_dev, cpu), &mce_banks[i].attr);
sysdev_unregister(&per_cpu(mce_dev, cpu));
@@ -1846,7 +1857,7 @@ static __cpuinit void mce_remove_device(
sysdev_remove_file(&per_cpu(mce_dev, cpu), mce_attrs[i]);
for (i = 0; i < banks; i++)
- sysdev_remove_file(&per_cpu(mce_dev, cpu), &bank_attrs[i]);
+ sysdev_remove_file(&per_cpu(mce_dev, cpu), &mce_banks[i].attr);
sysdev_unregister(&per_cpu(mce_dev, cpu));
cpumask_clear_cpu(cpu, mce_dev_initialized);
@@ -1863,7 +1874,8 @@ static void mce_disable_cpu(void *h)
if (!(action & CPU_TASKS_FROZEN))
cmci_clear();
for (i = 0; i < banks; i++) {
- if (!skip_bank_init(i))
+ struct mce_bank *b = &mce_banks[i];
+ if (b->init)
wrmsrl(MSR_IA32_MC0_CTL + i*4, 0);
}
}
@@ -1879,8 +1891,9 @@ static void mce_reenable_cpu(void *h)
if (!(action & CPU_TASKS_FROZEN))
cmci_reenable();
for (i = 0; i < banks; i++) {
- if (!skip_bank_init(i))
- wrmsrl(MSR_IA32_MC0_CTL + i*4, bank[i]);
+ struct mce_bank *b = &mce_banks[i];
+ if (b->init)
+ wrmsrl(MSR_IA32_MC0_CTL + i*4, b->ctl);
}
}
@@ -1928,35 +1941,21 @@ static struct notifier_block mce_cpu_not
.notifier_call = mce_cpu_callback,
};
-static __init int mce_init_banks(void)
+static __init void mce_init_banks(void)
{
int i;
- bank_attrs = kzalloc(sizeof(struct sysdev_attribute) * banks,
- GFP_KERNEL);
- if (!bank_attrs)
- return -ENOMEM;
-
for (i = 0; i < banks; i++) {
- struct sysdev_attribute *a = &bank_attrs[i];
+ struct mce_bank *b = &mce_banks[i];
+ struct sysdev_attribute *a = &b->attr;
- a->attr.name = kasprintf(GFP_KERNEL, "bank%d", i);
- if (!a->attr.name)
- goto nomem;
+ a->attr.name = b->attrname;
+ snprintf(b->attrname, ATTR_LEN, "bank%d", i);
a->attr.mode = 0644;
a->show = show_bank;
a->store = set_bank;
}
- return 0;
-
-nomem:
- while (--i >= 0)
- kfree(bank_attrs[i].attr.name);
- kfree(bank_attrs);
- bank_attrs = NULL;
-
- return -ENOMEM;
}
static __init int mce_init_device(void)
@@ -1969,9 +1968,7 @@ static __init int mce_init_device(void)
zalloc_cpumask_var(&mce_dev_initialized, GFP_KERNEL);
- err = mce_init_banks();
- if (err)
- return err;
+ mce_init_banks();
err = sysdev_class_register(&mce_sysclass);
if (err)
Index: linux/arch/x86/kernel/cpu/mcheck/mce-internal.h
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ linux/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -1,3 +1,4 @@
+#include <linux/sysdev.h>
#include <asm/mce.h>
enum severity_level {
@@ -10,6 +11,19 @@ enum severity_level {
MCE_PANIC_SEVERITY,
};
+#define ATTR_LEN 16
+
+/* One object for each MCE bank, shared by all CPUs */
+struct mce_bank {
+ u64 ctl; /* subevents to enable */
+ unsigned char init; /* initialise bank? */
+ struct sysdev_attribute attr; /* sysdev attribute */
+ char attrname[ATTR_LEN]; /* attribute name */
+};
+
int mce_severity(struct mce *a, int tolerant, char **msg);
extern int mce_ser;
+
+extern struct mce_bank *mce_banks;
+
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [7/8] x86: mce: macros to compute banks MSRs
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
` (5 preceding siblings ...)
2009-07-08 22:31 ` [PATCH] [6/8] x86: mce: Move per bank data in a single datastructure Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
2009-07-08 22:31 ` [PATCH] [8/8] x86: mce: Lower maximum number of banks to architecture limit Andi Kleen
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
Instead of open coded calculations for bank MSRs hide the indexing of higher
banks MCE register MSRs in new macros.
No semantic changes.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/include/asm/msr-index.h | 7 ++++++
arch/x86/kernel/cpu/mcheck/mce.c | 34 ++++++++++++++++-----------------
arch/x86/kernel/cpu/mcheck/mce_intel.c | 10 ++++-----
3 files changed, 29 insertions(+), 22 deletions(-)
Index: linux/arch/x86/include/asm/msr-index.h
===================================================================
--- linux.orig/arch/x86/include/asm/msr-index.h
+++ linux/arch/x86/include/asm/msr-index.h
@@ -81,8 +81,15 @@
#define MSR_IA32_MC0_ADDR 0x00000402
#define MSR_IA32_MC0_MISC 0x00000403
+#define MSR_IA32_MCx_CTL(x) (MSR_IA32_MC0_CTL + 4*(x))
+#define MSR_IA32_MCx_STATUS(x) (MSR_IA32_MC0_STATUS + 4*(x))
+#define MSR_IA32_MCx_ADDR(x) (MSR_IA32_MC0_ADDR + 4*(x))
+#define MSR_IA32_MCx_MISC(x) (MSR_IA32_MC0_MISC + 4*(x))
+
/* These are consecutive and not in the normal 4er MCE bank block */
#define MSR_IA32_MC0_CTL2 0x00000280
+#define MSR_IA32_MCx_CTL2(x) (MSR_IA32_MC0_CTL2 + (x))
+
#define CMCI_EN (1ULL << 30)
#define CMCI_THRESHOLD_MASK 0xffffULL
Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c
@@ -267,11 +267,11 @@ static int msr_to_offset(u32 msr)
unsigned bank = __get_cpu_var(injectm.bank);
if (msr == rip_msr)
return offsetof(struct mce, ip);
- if (msr == MSR_IA32_MC0_STATUS + bank*4)
+ if (msr == MSR_IA32_MCx_STATUS(bank))
return offsetof(struct mce, status);
- if (msr == MSR_IA32_MC0_ADDR + bank*4)
+ if (msr == MSR_IA32_MCx_ADDR(bank))
return offsetof(struct mce, addr);
- if (msr == MSR_IA32_MC0_MISC + bank*4)
+ if (msr == MSR_IA32_MCx_MISC(bank))
return offsetof(struct mce, misc);
if (msr == MSR_IA32_MCG_STATUS)
return offsetof(struct mce, mcgstatus);
@@ -485,7 +485,7 @@ void machine_check_poll(enum mcp_flags f
m.tsc = 0;
barrier();
- m.status = mce_rdmsrl(MSR_IA32_MC0_STATUS + i*4);
+ m.status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
if (!(m.status & MCI_STATUS_VAL))
continue;
@@ -500,9 +500,9 @@ void machine_check_poll(enum mcp_flags f
continue;
if (m.status & MCI_STATUS_MISCV)
- m.misc = mce_rdmsrl(MSR_IA32_MC0_MISC + i*4);
+ m.misc = mce_rdmsrl(MSR_IA32_MCx_MISC(i));
if (m.status & MCI_STATUS_ADDRV)
- m.addr = mce_rdmsrl(MSR_IA32_MC0_ADDR + i*4);
+ m.addr = mce_rdmsrl(MSR_IA32_MCx_ADDR(i));
if (!(flags & MCP_TIMESTAMP))
m.tsc = 0;
@@ -518,7 +518,7 @@ void machine_check_poll(enum mcp_flags f
/*
* Clear state for this bank.
*/
- mce_wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
+ mce_wrmsrl(MSR_IA32_MCx_STATUS(i), 0);
}
/*
@@ -539,7 +539,7 @@ static int mce_no_way_out(struct mce *m,
int i;
for (i = 0; i < banks; i++) {
- m->status = mce_rdmsrl(MSR_IA32_MC0_STATUS + i*4);
+ m->status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
if (mce_severity(m, tolerant, msg) >= MCE_PANIC_SEVERITY)
return 1;
}
@@ -823,7 +823,7 @@ static void mce_clear_state(unsigned lon
for (i = 0; i < banks; i++) {
if (test_bit(i, toclear))
- mce_wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
+ mce_wrmsrl(MSR_IA32_MCx_STATUS(i), 0);
}
}
@@ -904,7 +904,7 @@ void do_machine_check(struct pt_regs *re
m.addr = 0;
m.bank = i;
- m.status = mce_rdmsrl(MSR_IA32_MC0_STATUS + i*4);
+ m.status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
if ((m.status & MCI_STATUS_VAL) == 0)
continue;
@@ -945,9 +945,9 @@ void do_machine_check(struct pt_regs *re
kill_it = 1;
if (m.status & MCI_STATUS_MISCV)
- m.misc = mce_rdmsrl(MSR_IA32_MC0_MISC + i*4);
+ m.misc = mce_rdmsrl(MSR_IA32_MCx_MISC(i));
if (m.status & MCI_STATUS_ADDRV)
- m.addr = mce_rdmsrl(MSR_IA32_MC0_ADDR + i*4);
+ m.addr = mce_rdmsrl(MSR_IA32_MCx_ADDR(i));
/*
* Action optional error. Queue address for later processing.
@@ -1216,8 +1216,8 @@ static void mce_init(void)
struct mce_bank *b = &mce_banks[i];
if (!b->init)
continue;
- wrmsrl(MSR_IA32_MC0_CTL+4*i, b->ctl);
- wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
+ wrmsrl(MSR_IA32_MCx_CTL(i), b->ctl);
+ wrmsrl(MSR_IA32_MCx_STATUS(i), 0);
}
}
@@ -1589,7 +1589,7 @@ static int mce_disable(void)
for (i = 0; i < banks; i++) {
struct mce_bank *b = &mce_banks[i];
if (b->init)
- wrmsrl(MSR_IA32_MC0_CTL + i*4, 0);
+ wrmsrl(MSR_IA32_MCx_CTL(i), 0);
}
return 0;
}
@@ -1876,7 +1876,7 @@ static void mce_disable_cpu(void *h)
for (i = 0; i < banks; i++) {
struct mce_bank *b = &mce_banks[i];
if (b->init)
- wrmsrl(MSR_IA32_MC0_CTL + i*4, 0);
+ wrmsrl(MSR_IA32_MCx_CTL(i), 0);
}
}
@@ -1893,7 +1893,7 @@ static void mce_reenable_cpu(void *h)
for (i = 0; i < banks; i++) {
struct mce_bank *b = &mce_banks[i];
if (b->init)
- wrmsrl(MSR_IA32_MC0_CTL + i*4, b->ctl);
+ wrmsrl(MSR_IA32_MCx_CTL(i), b->ctl);
}
}
Index: linux/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -90,7 +90,7 @@ static void cmci_discover(int banks, int
if (test_bit(i, owned))
continue;
- rdmsrl(MSR_IA32_MC0_CTL2 + i, val);
+ rdmsrl(MSR_IA32_MCx_CTL2(i), val);
/* Already owned by someone else? */
if (val & CMCI_EN) {
@@ -101,8 +101,8 @@ static void cmci_discover(int banks, int
}
val |= CMCI_EN | CMCI_THRESHOLD;
- wrmsrl(MSR_IA32_MC0_CTL2 + i, val);
- rdmsrl(MSR_IA32_MC0_CTL2 + i, val);
+ wrmsrl(MSR_IA32_MCx_CTL2(i), val);
+ rdmsrl(MSR_IA32_MCx_CTL2(i), val);
/* Did the enable bit stick? -- the bank supports CMCI */
if (val & CMCI_EN) {
@@ -152,9 +152,9 @@ void cmci_clear(void)
if (!test_bit(i, __get_cpu_var(mce_banks_owned)))
continue;
/* Disable CMCI */
- rdmsrl(MSR_IA32_MC0_CTL2 + i, val);
+ rdmsrl(MSR_IA32_MCx_CTL2(i), val);
val &= ~(CMCI_EN|CMCI_THRESHOLD_MASK);
- wrmsrl(MSR_IA32_MC0_CTL2 + i, val);
+ wrmsrl(MSR_IA32_MCx_CTL2(i), val);
__clear_bit(i, __get_cpu_var(mce_banks_owned));
}
spin_unlock_irqrestore(&cmci_discover_lock, flags);
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] [8/8] x86: mce: Lower maximum number of banks to architecture limit
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
` (6 preceding siblings ...)
2009-07-08 22:31 ` [PATCH] [7/8] x86: mce: macros to compute banks MSRs Andi Kleen
@ 2009-07-08 22:31 ` Andi Kleen
7 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-07-08 22:31 UTC (permalink / raw)
To: linux-kernel, x86
The Intel x86 architecture right now only supports 32 machine check
banks, more would bump into other MSRs.
So lower the max define to 32.
This only affects a few bitmaps, most data structures are dynamically
sized anyways.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/include/asm/mce.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux/arch/x86/include/asm/mce.h
===================================================================
--- linux.orig/arch/x86/include/asm/mce.h
+++ linux/arch/x86/include/asm/mce.h
@@ -130,10 +130,11 @@ void mce_log(struct mce *m);
DECLARE_PER_CPU(struct sys_device, mce_dev);
/*
- * To support more than 128 would need to escape the predefined
- * Linux defined extended banks first.
+ * Maximum banks number.
+ * This is the limit of the current register layout on
+ * Intel CPUs.
*/
-#define MAX_NR_BANKS (MCE_EXTENDED_BANK - 1)
+#define MAX_NR_BANKS 32
#ifdef CONFIG_X86_MCE_INTEL
extern int mce_cmci_disabled;
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-07-08 22:33 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-08 22:31 [PATCH] [0/8] Remove old deprecated x86 old machine check code & some cleanups Andi Kleen
2009-07-08 22:31 ` [PATCH] [1/8] x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE Andi Kleen
2009-07-08 22:31 ` [PATCH] [2/8] x86: mce: Update X86_MCE description in x86/Kconfig Andi Kleen
2009-07-08 22:31 ` [PATCH] [3/8] x86: mce: Remove old i386 machine check code Andi Kleen
2009-07-08 22:31 ` [PATCH] [4/8] x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE Andi Kleen
2009-07-08 22:31 ` [PATCH] [5/8] x86: mce: Move code in mce.c Andi Kleen
2009-07-08 22:31 ` [PATCH] [6/8] x86: mce: Move per bank data in a single datastructure Andi Kleen
2009-07-08 22:31 ` [PATCH] [7/8] x86: mce: macros to compute banks MSRs Andi Kleen
2009-07-08 22:31 ` [PATCH] [8/8] x86: mce: Lower maximum number of banks to architecture limit Andi Kleen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox