* [PATCH -mm] x86 port lockless MCE quirky bank0
@ 2005-05-17 20:18 Hugh Dickins
2005-05-17 20:24 ` Andi Kleen
0 siblings, 1 reply; 4+ messages in thread
From: Hugh Dickins @ 2005-05-17 20:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Yu, Luming, Guo, Racing, Andi Kleen, Dave Jones, linux-kernel
I've been avoiding -mm on the Dell Latitude with PIII Mobile for several
weeks now, since APM resume from RAM locks up. Just stole time to bisect
and it's the x86-port-lockless-mce patches, which lack some of the wisdom
of ages handed down in the earlier MCE source.
The "Don't enable bank 0 on Intel P6 cores, it goes bang quickly" fix
from the old mcheck/p6.c solves my PIII problem; but looking further,
the old mcheck/k7.c says some Athlons also have a problem with bank 0.
So try to handle them both together: but this code is an amalgam of
what was done slightly differently in the two cases, diverging from each
- I don't want to add voodoo if it's unnecessary; and untested on AMD.
Worry: has more such wisdom got lost in the translation?
Signed-off-by: Hugh Dickins <hugh@veritas.com>
--- 2.6.12-rc4-mm2/arch/i386/kernel/cpu/mcheck/mce.c 2005-05-16 12:18:35.000000000 +0100
+++ linux/arch/i386/kernel/cpu/mcheck/mce.c 2005-05-17 20:21:17.000000000 +0100
@@ -30,7 +30,7 @@ int __devinitdata mce_dont_init = 0;
/* 0: always panic, 1: panic if deadlock possible, 2: try to avoid panic,
3: never panic or exit (for testing only) */
static unsigned long tolerant = 1;
-static int banks;
+static int banks, quirky_bank0;
static unsigned long bank[NR_BANKS] = { [0 ... NR_BANKS-1] = ~0UL };
static unsigned long console_logged;
static int notify_user;
@@ -320,7 +320,8 @@ static void mce_init(void *dummy)
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
for (i = 0; i < banks; i++) {
- wrmsrl(MSR_IA32_MC0_CTL+4*i, bank[i]);
+ if (i >= quirky_bank0)
+ wrmsrl(MSR_IA32_MC0_CTL+4*i, bank[i]);
wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
}
}
@@ -334,6 +335,14 @@ static void __devinit mce_cpu_quirks(str
incorrectly with the IOMMU & 3ware & Cerberus. */
clear_bit(10, &bank[4]);
}
+ if ((c->x86_vendor == X86_VENDOR_AMD ||
+ c->x86_vendor == X86_VENDOR_INTEL) && c->x86 == 6) {
+ /*
+ * Intel P6 cores go bang quickly when bank0 is enabled.
+ * Some Athlons cause spurious MCEs when bank0 is enabled.
+ */
+ quirky_bank0 = 1;
+ }
}
static void __devinit mce_cpu_features(struct cpuinfo_x86 *c)
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH -mm] x86 port lockless MCE quirky bank0
2005-05-17 20:18 [PATCH -mm] x86 port lockless MCE quirky bank0 Hugh Dickins
@ 2005-05-17 20:24 ` Andi Kleen
2005-05-17 22:39 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2005-05-17 20:24 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrew Morton, Yu, Luming, Guo, Racing, Andi Kleen, Dave Jones,
linux-kernel
> + if ((c->x86_vendor == X86_VENDOR_AMD ||
> + c->x86_vendor == X86_VENDOR_INTEL) && c->x86 == 6) {
> + /*
> + * Intel P6 cores go bang quickly when bank0 is enabled.
> + * Some Athlons cause spurious MCEs when bank0 is enabled.
> + */
> + quirky_bank0 = 1;
> + }
That's wrong on K8 AMD machines at least. You need to check c->x86
there too.
And better would be to just do bank[0] = 0 instead of
adding the new variable.
-Andi
P.S.: Also Yu Luming can you please submit an updated patch that keeps mce.c
in arch/x86_64 like we discussed earlier?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH -mm] x86 port lockless MCE quirky bank0
2005-05-17 20:24 ` Andi Kleen
@ 2005-05-17 22:39 ` Andrew Morton
2005-05-18 12:34 ` Hugh Dickins
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2005-05-17 22:39 UTC (permalink / raw)
To: Andi Kleen; +Cc: hugh, luming.yu, racing.guo, ak, davej, linux-kernel
Andi Kleen <ak@suse.de> wrote:
>
> > + if ((c->x86_vendor == X86_VENDOR_AMD ||
> > + c->x86_vendor == X86_VENDOR_INTEL) && c->x86 == 6) {
> > + /*
> > + * Intel P6 cores go bang quickly when bank0 is enabled.
> > + * Some Athlons cause spurious MCEs when bank0 is enabled.
> > + */
> > + quirky_bank0 = 1;
> > + }
>
> That's wrong on K8 AMD machines at least. You need to check c->x86
> there too.
>
> And better would be to just do bank[0] = 0 instead of
> adding the new variable.
>
> -Andi
>
> P.S.: Also Yu Luming can you please submit an updated patch that keeps mce.c
> in arch/x86_64 like we discussed earlier?
Yes. I'll drop the following from -mm:
x86-port-lockless-mce-preparation.patch
x86-port-lockless-mce-implementation.patch
x86-port-lockless-mce-implementation-fix.patch
x86-port-lockless-mce-implementation-fix-2.patch
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH -mm] x86 port lockless MCE quirky bank0
2005-05-17 22:39 ` Andrew Morton
@ 2005-05-18 12:34 ` Hugh Dickins
0 siblings, 0 replies; 4+ messages in thread
From: Hugh Dickins @ 2005-05-18 12:34 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andi Kleen, luming.yu, racing.guo, davej, linux-kernel
On Tue, 17 May 2005, Andrew Morton wrote:
>
> Yes. I'll drop the following from -mm:
>
> x86-port-lockless-mce-preparation.patch
> x86-port-lockless-mce-implementation.patch
> x86-port-lockless-mce-implementation-fix.patch
> x86-port-lockless-mce-implementation-fix-2.patch
The right decision for now, I think - thanks.
I presume they may return later on, so I'd better confess:
I lied when I said my patch fixed the P6 bank0 issue, I was confused
between rebuildings and rebootings. It should have fixed the issue,
but the patch which actually fixed it was one earlier, which had an
off-by-one ("> quirky_bank0") which I'd corrected before posting.
The ">= quirky_bank0" patch was not enough to fix the issue because...
mce_cpu_quirks (and mce_init and mce_cpu_features) are never called at
startup (on Intel or Centaur), because the logic in machine_check_init
(see below) to call mcheck_init is broken.
Which explains why I got the freeze at resume not at startup,
and casts doubt on how much any of it has got tested so far.
Hugh
void __devinit machine_check_init(struct cpuinfo_x86 *c)
{
if (mce_dont_init)
return;
switch (c->x86_vendor) {
case X86_VENDOR_INTEL:
if (c->x86==5)
intel_p5_mcheck_init(c);
break;
case X86_VENDOR_CENTAUR:
if (c->x86==5)
winchip_mcheck_init(c);
break;
default:
machine_check_vector = do_machine_check;
wmb();
mcheck_init(c);
break;
}
}
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2005-05-18 12:37 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-17 20:18 [PATCH -mm] x86 port lockless MCE quirky bank0 Hugh Dickins
2005-05-17 20:24 ` Andi Kleen
2005-05-17 22:39 ` Andrew Morton
2005-05-18 12:34 ` Hugh Dickins
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.