public inbox for linux-kernel@vger.kernel.org
* backport AMD K7 MCE changes.
@ 2003-10-03 20:54 Dave Jones
From: Dave Jones @ 2003-10-03 20:54 UTC
  To: Linux Kernel

We're still accessing MCE bank 0 on Athlons in 2.4 when we shouldn't.
(We don't enable it, but we still check it in the exception handler.)
This is fixed differently in 2.6, but the patch below is a minimal
change that achieves the same effect.

		Dave
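
For readers without a 2.4 tree at hand, here is a minimal userland sketch
of what the handler change amounts to. It is an illustration under stated
assumptions, not the kernel code: fake_status[], fake_rdmsr(), NBANKS and
count_valid_banks() are hypothetical names invented for this example, and
the register file is simulated in memory. It relies only on the layout the
patch itself uses: bank i's status MSR sits at MSR_IA32_MC0_STATUS + i*4,
and bit 31 of the high word is the "valid" flag.

```c
#include <stdint.h>
#include <stdio.h>

#define MSR_IA32_MC0_STATUS 0x401u  /* bank i status is at 0x401 + 4*i */
#define NBANKS 4                    /* hypothetical bank count for the demo */

/* Hypothetical stand-in for the MSR file: one 64-bit status word per bank. */
static uint64_t fake_status[NBANKS];

/* Userland replacement for rdmsr(): splits the simulated register in two. */
static void fake_rdmsr(uint32_t msr, uint32_t *low, uint32_t *high)
{
	uint32_t bank = (msr - MSR_IA32_MC0_STATUS) / 4;

	*low = (uint32_t)fake_status[bank];
	*high = (uint32_t)(fake_status[bank] >> 32);
}

/*
 * Scan banks [startbank, banks) and count those whose status word has the
 * valid bit set (bit 63 overall, i.e. bit 31 of the high half).
 */
static int count_valid_banks(int startbank, int banks)
{
	uint32_t low, high;
	int i, hits = 0;

	for (i = startbank; i < banks; i++) {
		fake_rdmsr(MSR_IA32_MC0_STATUS + i * 4, &low, &high);
		if (high & (1u << 31)) {
			printf("bank %d: %08x%08x\n", i,
			       (unsigned)high, (unsigned)low);
			hits++;
		}
	}
	return hits;
}
```

With a stale valid bit set only in the simulated bank 0, a scan that starts
at 0 reports that bank, while a scan that starts at 1 skips it. That is the
whole effect of setting startbank = 1 for AMD.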

--- linux-2.4.22/arch/i386/kernel/bluesmoke.c~	2003-10-03 21:47:11.000000000 +0100
+++ linux-2.4.22/arch/i386/kernel/bluesmoke.c	2003-10-03 21:49:55.000000000 +0100
@@ -16,6 +16,7 @@
  */
 
 static int banks;
+static int startbank;
 
 static void intel_machine_check(struct pt_regs * regs, long error_code)
 {
@@ -30,7 +31,7 @@
 
 	printk(KERN_EMERG "CPU %d: Machine Check Exception: %08x%08x\n", smp_processor_id(), mcgsth, mcgstl);
 	
-	for(i=0;i<banks;i++)
+	for(i=startbank;i<banks;i++)
 	{
 		rdmsr(MSR_IA32_MC0_STATUS+i*4,low, high);
 		if(high&(1<<31))
@@ -219,10 +220,13 @@
 	{
 		case X86_VENDOR_AMD:
 			/*
-			 *	AMD K7 machine check is Intel like
+			 *	AMD K7/K8 machine check is Intel like.
 			 */
-			if(c->x86 == 6 || c->x86 == 15)
+			
+			if(c->x86 == 6 || c->x86 == 15) {
+				startbank = 1;
 				intel_mcheck_init(c);
+			}
 			break;
 		case X86_VENDOR_INTEL:
 			intel_mcheck_init(c);

-- 
 Dave Jones     http://www.codemonkey.org.uk


* Re: backport AMD K7 MCE changes.
       [not found] <20031003205408.GA17829@redhat.com.suse.lists.linux.kernel>
@ 2003-10-03 21:08 ` Andi Kleen
From: Andi Kleen @ 2003-10-03 21:08 UTC
  To: Dave Jones; +Cc: linux-kernel

Dave Jones <davej@redhat.com> writes:
>  			 */
> -			if(c->x86 == 6 || c->x86 == 15)
> +			
> +			if(c->x86 == 6 || c->x86 == 15) {
> +				startbank = 1;

Can you please add comments to such magic changes?

I still think we should not do anything without an official erratum.

The K7 actually has two MCE enable registers: a mask that is only
supposed to be programmed by the BIOS and works around bugs, and the
standardized IA32 register controlled by the OS. I suspect your case
only happens with a buggy BIOS that doesn't program the shadow mask
correctly. That would imply it makes more sense to test the mask
and program it correctly based on a known good value.

-Andi
