* [PATCH] [0/4] Updates to the x86 machine check unification
@ 2008-08-05 17:17 Andi Kleen
2008-08-05 17:17 ` [PATCH] [1/4] MCE: Fix ifdefs for Intel thermal handler Andi Kleen
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Andi Kleen @ 2008-08-05 17:17 UTC
To: x86, linux-kernel
Should be applied to the respective x86 topic branch.
- Fix Intel thermal handling broken in earlier merge
- Add old code to the deprecation schedule
- Fix a long standing bug in thermal interrupt handling after
suspend to ram
- Revert older bogus patch
-Andi
* [PATCH] [1/4] MCE: Fix ifdefs for Intel thermal handler
2008-08-05 17:17 [PATCH] [0/4] Updates to the x86 machine check unification Andi Kleen
@ 2008-08-05 17:17 ` Andi Kleen
2008-08-05 17:17 ` [PATCH] [2/4] MCE: Add old machine check code to feature-removal-schedule.txt Andi Kleen
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2008-08-05 17:17 UTC
To: x86, linux-kernel
I forgot to switch over some ifdefs during the machine check unification.
This broke the Intel thermal handler on 32bit with the new machine
check code. Fix this.
Ideally, this patch should be folded into the earlier conversion patch
("Use 64bit machine check code on 32bit").
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Index: linux/arch/x86/kernel/apic_32.c
===================================================================
--- linux.orig/arch/x86/kernel/apic_32.c
+++ linux/arch/x86/kernel/apic_32.c
@@ -700,7 +700,7 @@ void clear_local_APIC(void)
}
/* lets not touch this if we didn't frob it */
-#ifdef CONFIG_X86_MCE_P4THERMAL
+#if defined(CONFIG_X86_MCE_P4THERMAL) || defined(CONFIG_X86_MCE_INTEL)
if (maxlvt >= 5) {
v = apic_read(APIC_LVTTHMR);
apic_write_around(APIC_LVTTHMR, v | APIC_LVT_MASKED);
@@ -717,7 +717,7 @@ void clear_local_APIC(void)
if (maxlvt >= 4)
apic_write_around(APIC_LVTPC, APIC_LVT_MASKED);
-#ifdef CONFIG_X86_MCE_P4THERMAL
+#if defined(CONFIG_X86_MCE_P4THERMAL) || defined(CONFIG_X86_MCE_INTEL)
if (maxlvt >= 5)
apic_write_around(APIC_LVTTHMR, APIC_LVT_MASKED);
#endif
@@ -1377,7 +1377,7 @@ void __init apic_intr_init(void)
set_intr_gate(ERROR_APIC_VECTOR, error_interrupt);
/* thermal monitor LVT interrupt */
-#ifdef CONFIG_X86_MCE_P4THERMAL
+#if defined(CONFIG_X86_MCE_P4THERMAL) || defined(CONFIG_X86_MCE_INTEL)
set_intr_gate(THERMAL_APIC_VECTOR, thermal_interrupt);
#endif
}
@@ -1595,7 +1595,7 @@ static int lapic_suspend(struct sys_devi
apic_pm_state.apic_lvterr = apic_read(APIC_LVTERR);
apic_pm_state.apic_tmict = apic_read(APIC_TMICT);
apic_pm_state.apic_tdcr = apic_read(APIC_TDCR);
-#ifdef CONFIG_X86_MCE_P4THERMAL
+#if defined(CONFIG_X86_MCE_P4THERMAL) || defined(CONFIG_X86_MCE_INTEL)
if (maxlvt >= 5)
apic_pm_state.apic_thmr = apic_read(APIC_LVTTHMR);
#endif
Index: linux/include/asm-x86/mach-default/entry_arch.h
===================================================================
--- linux.orig/include/asm-x86/mach-default/entry_arch.h
+++ linux/include/asm-x86/mach-default/entry_arch.h
@@ -27,7 +27,7 @@ BUILD_INTERRUPT(apic_timer_interrupt,LOC
BUILD_INTERRUPT(error_interrupt,ERROR_APIC_VECTOR)
BUILD_INTERRUPT(spurious_interrupt,SPURIOUS_APIC_VECTOR)
-#ifdef CONFIG_X86_MCE_P4THERMAL
+#if defined(CONFIG_X86_MCE_P4THERMAL) || defined(CONFIG_X86_MCE_INTEL)
BUILD_INTERRUPT(thermal_interrupt,THERMAL_APIC_VECTOR)
#endif
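
To illustrate the guard change in the patch above, here is a minimal userspace sketch, not kernel code. It assumes a configuration in which only CONFIG_X86_MCE_INTEL is defined (the situation the description attributes to 32bit with the new machine check code). The function names are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/*
 * Stand-in for the Kconfig output. Assumption for this sketch: only the
 * unified Intel machine check option is selected, and
 * CONFIG_X86_MCE_P4THERMAL is left undefined.
 */
#define CONFIG_X86_MCE_INTEL 1

/* Old guard: the thermal code is built only with the legacy P4 option. */
bool thermal_built_old_guard(void)
{
#ifdef CONFIG_X86_MCE_P4THERMAL
	return true;
#else
	return false;
#endif
}

/* New guard from the patch: either option builds the thermal code. */
bool thermal_built_new_guard(void)
{
#if defined(CONFIG_X86_MCE_P4THERMAL) || defined(CONFIG_X86_MCE_INTEL)
	return true;
#else
	return false;
#endif
}
```

Under that assumed configuration the old guard compiles the thermal path out while the new guard keeps it, which matches the breakage the patch description reports.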
* [PATCH] [2/4] MCE: Add old machine check code to feature-removal-schedule.txt
2008-08-05 17:17 [PATCH] [0/4] Updates to the x86 machine check unification Andi Kleen
2008-08-05 17:17 ` [PATCH] [1/4] MCE: Fix ifdefs for Intel thermal handler Andi Kleen
@ 2008-08-05 17:17 ` Andi Kleen
2008-08-05 17:17 ` [PATCH] [3/4] MCE: Reinitialize per cpu features and ancient mces on resume Andi Kleen
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2008-08-05 17:17 UTC
To: x86, linux-kernel
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Index: linux/Documentation/feature-removal-schedule.txt
===================================================================
--- linux.orig/Documentation/feature-removal-schedule.txt
+++ linux/Documentation/feature-removal-schedule.txt
@@ -321,3 +321,13 @@ Why: This option was introduced just to
to keep working over the upgrade to 2.6.26. At the scheduled time of
removal fixed lm-sensors (2.x or 3.x) should be readily available.
Who: Rene Herman <rene.herman@gmail.com>
+
+----------------------------
+
+What: CONFIG_X86_OLD_MCE
+When: 2.6.29
+Why: Remove the old legacy 32bit machine check code. This has been superseded
+ by the 64bit machine check code, but the old version has been kept
+ around for easier testing. Note this doesn't impact the old P5 and WinChip
+ machine check handlers.
+Who: Andi Kleen <ak@linux.intel.com>
* [PATCH] [3/4] MCE: Reinitialize per cpu features and ancient mces on resume
2008-08-05 17:17 [PATCH] [0/4] Updates to the x86 machine check unification Andi Kleen
2008-08-05 17:17 ` [PATCH] [1/4] MCE: Fix ifdefs for Intel thermal handler Andi Kleen
2008-08-05 17:17 ` [PATCH] [2/4] MCE: Add old machine check code to feature-removal-schedule.txt Andi Kleen
@ 2008-08-05 17:17 ` Andi Kleen
2008-08-05 17:17 ` [PATCH] [4/4] MCE: Don't disable machine checks during code patching Andi Kleen
2008-08-08 22:23 ` [PATCH] [0/4] Updates to the x86 machine check unification H. Peter Anvin
4 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2008-08-05 17:17 UTC
To: x86, linux-kernel
This fixes a long-standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor-specific state, such as thermal handling,
reinitialized. This means the boot CPU would never get any thermal
events reported again. The newly added ancient CPUs have the same problem.
Call the respective initialization functions on resume.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Index: linux/arch/x86/kernel/cpu/mcheck/mce_64.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -753,6 +753,8 @@ __setup("mce=", mcheck_enable);
static int mce_resume(struct sys_device *dev)
{
mce_init(NULL);
+ mce_ancient_init(&current_cpu_data);
+ mce_cpu_features(&current_cpu_data);
return 0;
}
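
The effect of the resume fix above can be modelled in userspace. This is a rough sketch under stated assumptions, not the kernel implementation: `struct cpu_mce_state` and every `model_*` function are hypothetical names, suspend is assumed to lose both generic and vendor-specific state, and the `mce_ancient_init()` call is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's machine check setup; not a kernel type. */
struct cpu_mce_state {
	bool banks_enabled;   /* generic state, the part mce_init() covers */
	bool thermal_enabled; /* vendor state, the part mce_cpu_features() covers */
};

/* Assumption: suspend to RAM loses both kinds of state. */
void model_suspend(struct cpu_mce_state *s)
{
	s->banks_enabled = false;
	s->thermal_enabled = false;
}

void model_generic_init(struct cpu_mce_state *s)
{
	s->banks_enabled = true;
}

void model_vendor_init(struct cpu_mce_state *s)
{
	s->thermal_enabled = true;
}

/* Resume path before the patch: only the generic part is redone. */
void model_resume_old(struct cpu_mce_state *s)
{
	model_generic_init(s);
}

/* Resume path after the patch: vendor-specific setup is redone as well. */
void model_resume_new(struct cpu_mce_state *s)
{
	model_generic_init(s);
	model_vendor_init(s);
}
```

In this model, a suspend followed by `model_resume_old()` leaves `thermal_enabled` false, which corresponds to the boot CPU never reporting thermal events again; `model_resume_new()` restores both fields.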
* [PATCH] [4/4] MCE: Don't disable machine checks during code patching
2008-08-05 17:17 [PATCH] [0/4] Updates to the x86 machine check unification Andi Kleen
` (2 preceding siblings ...)
2008-08-05 17:17 ` [PATCH] [3/4] MCE: Reinitialize per cpu features and ancient mces on resume Andi Kleen
@ 2008-08-05 17:17 ` Andi Kleen
2008-08-08 22:23 ` [PATCH] [0/4] Updates to the x86 machine check unification H. Peter Anvin
4 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2008-08-05 17:17 UTC
To: x86, linux-kernel
This removes part of a patch I added myself some time ago. After some
consideration, I concluded that the patch was a bad idea. In particular, it
stopped machine check exceptions during code patching.
To quote the comment:
* MCEs only happen when something got corrupted and in this
* case we must do something about the corruption.
* Ignoring it is worse than an unlikely patching race.
* Also machine checks tend to be broadcast and if one CPU
* goes into machine check the others follow quickly, so we don't
* expect a machine check to cause undue problems during code
* patching.
So undo the machine check related parts of
8f4e956b313dcccbc7be6f10808952345e3b638c. NMIs are still disabled.
This only removes code; the only addition is a new comment.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/alternative.c | 17 +++++++++++------
arch/x86/kernel/cpu/mcheck/mce_32.c | 14 --------------
arch/x86/kernel/cpu/mcheck/mce_64.c | 14 --------------
include/asm-x86/mce.h | 2 --
4 files changed, 11 insertions(+), 36 deletions(-)
Index: linux/arch/x86/kernel/alternative.c
===================================================================
--- linux.orig/arch/x86/kernel/alternative.c
+++ linux/arch/x86/kernel/alternative.c
@@ -424,9 +424,17 @@ void __init alternative_instructions(voi
that might execute the to be patched code.
Other CPUs are not running. */
stop_nmi();
-#ifdef CONFIG_X86_MCE
- stop_mce();
-#endif
+
+ /*
+ * Don't stop machine check exceptions while patching.
+ * MCEs only happen when something got corrupted and in this
+ * case we must do something about the corruption.
+ * Ignoring it is worse than an unlikely patching race.
+ * Also machine checks tend to be broadcast and if one CPU
+ * goes into machine check the others follow quickly, so we don't
+ * expect a machine check to cause undue problems during code
+ * patching.
+ */
apply_alternatives(__alt_instructions, __alt_instructions_end);
@@ -466,9 +474,6 @@ void __init alternative_instructions(voi
(unsigned long)__smp_locks_end);
restart_nmi();
-#ifdef CONFIG_X86_MCE
- restart_mce();
-#endif
}
/**
Index: linux/arch/x86/kernel/cpu/mcheck/mce_64.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -699,20 +699,6 @@ static struct miscdevice mce_log_device
&mce_chrdev_ops,
};
-static unsigned long old_cr4 __initdata;
-
-void __init stop_mce(void)
-{
- old_cr4 = read_cr4();
- clear_in_cr4(X86_CR4_MCE);
-}
-
-void __init restart_mce(void)
-{
- if (old_cr4 & X86_CR4_MCE)
- set_in_cr4(X86_CR4_MCE);
-}
-
/*
* Old style boot options parsing. Only for compatibility.
*/
Index: linux/include/asm-x86/mce.h
===================================================================
--- linux.orig/include/asm-x86/mce.h
+++ linux/include/asm-x86/mce.h
@@ -113,8 +113,6 @@ extern void mcheck_init(struct cpuinfo_x
#else
#define mcheck_init(c) do { } while (0)
#endif
-extern void stop_mce(void);
-extern void restart_mce(void);
#endif /* __KERNEL__ */
Index: linux/arch/x86/kernel/cpu/mcheck/mce_32.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_32.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce_32.c
@@ -61,20 +61,6 @@ void mcheck_init(struct cpuinfo_x86 *c)
}
}
-static unsigned long old_cr4 __initdata;
-
-void __init stop_mce(void)
-{
- old_cr4 = read_cr4();
- clear_in_cr4(X86_CR4_MCE);
-}
-
-void __init restart_mce(void)
-{
- if (old_cr4 & X86_CR4_MCE)
- set_in_cr4(X86_CR4_MCE);
-}
-
static int __init mcheck_disable(char *str)
{
mce_disabled = 1;
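
For readers unfamiliar with the helpers removed above, the following userspace sketch models what `stop_mce()` and `restart_mce()` did to the MCE enable bit. It is an illustration under assumptions, not the removed kernel code: `model_cr4` is an ordinary variable standing in for the CR4 register, and the `model_*` and `MODEL_*` names are hypothetical.

```c
/* CR4.MCE is bit 6 on x86; the kernel calls this mask X86_CR4_MCE. */
#define MODEL_CR4_MCE (1UL << 6)

/* Plain variable standing in for the real control register. */
static unsigned long model_cr4;

/* Register value captured when machine checks were stopped. */
static unsigned long model_saved_cr4;

/* Mirrors stop_mce(): remember the old value, then clear the MCE bit. */
void model_stop_mce(void)
{
	model_saved_cr4 = model_cr4;
	model_cr4 &= ~MODEL_CR4_MCE;
}

/* Mirrors restart_mce(): set the bit again only if it was set before. */
void model_restart_mce(void)
{
	if (model_saved_cr4 & MODEL_CR4_MCE)
		model_cr4 |= MODEL_CR4_MCE;
}
```

Between the two calls the MCE bit is clear, which is the window in which the patch description says machine check exceptions were stopped. Other bits of the register are left untouched in both directions.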
* Re: [PATCH] [0/4] Updates to the x86 machine check unification
2008-08-05 17:17 [PATCH] [0/4] Updates to the x86 machine check unification Andi Kleen
` (3 preceding siblings ...)
2008-08-05 17:17 ` [PATCH] [4/4] MCE: Don't disable machine checks during code patching Andi Kleen
@ 2008-08-08 22:23 ` H. Peter Anvin
4 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2008-08-08 22:23 UTC
To: Andi Kleen; +Cc: x86, linux-kernel
Andi Kleen wrote:
> Should be applied to the respective x86 topic branch.
>
> - Fix Intel thermal handling broken in earlier merge
> - Add old code to the deprecation schedule
> - Fix a long standing bug in thermal interrupt handling after
> suspend to ram
> - Revert older bogus patch
>
Applied to tip:x86/unify-mce, thanks.
-hpa