* x86 32-bit machine check handler
@ 2007-11-12 20:39 Max Asbock
2007-11-12 21:20 ` H. Peter Anvin
2007-11-13 14:15 ` Andi Kleen
0 siblings, 2 replies; 5+ messages in thread
From: Max Asbock @ 2007-11-12 20:39 UTC (permalink / raw)
To: lkml; +Cc: tglx, mingo, hpa
Now that the 32-bit and 64-bit x86 machine check handlers live next to
each other a certain asymmetry in functionality is apparent. Notably,
the 64-bit machine check handler implements a timer that periodically
polls for silent machine check errors and makes them accessible to user
space through /dev/mcelog. Are there reasons the x86 32-bit machine
check handler couldn't do the same?
thanks,
Max
* Re: x86 32-bit machine check handler
From: H. Peter Anvin @ 2007-11-12 21:20 UTC (permalink / raw)
To: Max Asbock; +Cc: lkml, tglx, mingo
Max Asbock wrote:
> Now that the 32-bit and 64-bit x86 machine check handlers live next to
> each other a certain asymmetry in functionality is apparent. Notably,
> the 64-bit machine check handler implements a timer that periodically
> polls for silent machine check errors and makes them accessible to user
> space through /dev/mcelog. Are there reasons the x86 32-bit machine
> check handler couldn't do the same?
No, and in fact, it should.
-hpa
* Re: x86 32-bit machine check handler
From: Andi Kleen @ 2007-11-13 14:15 UTC (permalink / raw)
To: Max Asbock; +Cc: lkml, tglx, mingo, hpa
Max Asbock <masbock@us.ibm.com> writes:
> Now that the 32-bit and 64-bit x86 machine check handlers live next to
> each other a certain asymmetry in functionality is apparent. Notably,
> the 64-bit machine check handler implements a timer that periodically
> polls for silent machine check errors and makes them accessible to user
> space through /dev/mcelog.
Actually 32bit implements that too (non-fatal.c). But it misses some
of the more advanced functionality like AMD Threshold Interrupts.
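[Editor's sketch, not from the thread.] The polling idea discussed above can be illustrated in user space roughly as follows. It assumes nothing about mce_64.c or non-fatal.c beyond what is said here: the bank array stands in for reading the per-bank MCi_STATUS MSRs, and every mock_* name is hypothetical. The only hardware fact relied on is that bit 63 of MCi_STATUS is the "valid" bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NBANKS    4                /* hypothetical bank count */
#define MOCK_LOG_LEN   8                /* hypothetical log capacity */
#define MCI_STATUS_VAL (1ULL << 63)     /* MCi_STATUS valid bit */

/* Stand-in for the per-bank status MSRs; real code would use rdmsr. */
static uint64_t mock_bank_status[MOCK_NBANKS];

struct mock_entry {
    int      bank;
    uint64_t status;
};

/* Stand-in for the buffer that user space would read from. */
static struct mock_entry mock_log[MOCK_LOG_LEN];
static size_t mock_log_next;

/*
 * One pass of the periodic poll: record every bank whose valid bit is
 * set, then clear the bank so the same event is not reported twice.
 * Returns the number of banks that held a valid error.
 */
static int mock_poll_banks(void)
{
    int found = 0;

    for (int bank = 0; bank < MOCK_NBANKS; bank++) {
        uint64_t status = mock_bank_status[bank];

        if (!(status & MCI_STATUS_VAL))
            continue;                   /* nothing logged in this bank */

        if (mock_log_next < MOCK_LOG_LEN) {
            mock_log[mock_log_next].bank = bank;
            mock_log[mock_log_next].status = status;
            mock_log_next++;
        }                               /* else: log full, entry dropped */

        mock_bank_status[bank] = 0;     /* acknowledge the error */
        found++;
    }

    assert(mock_log_next <= MOCK_LOG_LEN);
    return found;
}
```

A real handler would run this from a timer on each CPU and would have to worry about concurrent readers of the log; both are left out of the sketch.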
> Are there reasons the x86 32-bit machine
> check handler couldn't do the same?
The 32bit machine check code has some serious design problems. The
best approach would probably be to just move 32bit over to the 64bit code too. In
fact there was a patch to do that some time ago, but it ran into some
minor problems and was unfortunately never merged. But it would be the
right thing to do.
The only missing functionality on the 64bit side would be support for
old, non-IA-compliant machine checks like P5 or WinChip. One option
would be to simply drop them. AFAIK these CPUs don't really have
anywhere near usable machine check capability anyways so dropping it
would not make much difference. Or alternatively keep p5.c/winchip.c
around. But if you look at them, they don't do much except a simple
printk with little information, and printk in a machine check handler
is always wrong because it can deadlock. I personally would prefer
dropping them.
And I think one or two K7 quirks are also missing on 64bit, but these
would be very easy to add. Other than that it should just work on
32bit CPUs.
-Andi
* Re: x86 32-bit machine check handler
From: Max Asbock @ 2007-11-15 1:06 UTC (permalink / raw)
To: Andi Kleen; +Cc: lkml, tglx, mingo, hpa
On Tue, 2007-11-13 at 15:15 +0100, Andi Kleen wrote:
> Max Asbock <masbock@us.ibm.com> writes:
>
> > Now that the 32-bit and 64-bit x86 machine check handlers live next to
> > each other a certain asymmetry in functionality is apparent. Notably,
> > the 64-bit machine check handler implements a timer that periodically
> > polls for silent machine check errors and makes them accessible to user
> > space through /dev/mcelog.
>
> Actually 32bit implements that too (non-fatal.c). But it misses some
> of the more advanced functionality like AMD Threshold Interrupts.
>
> > Are there reasons the x86 32-bit machine
> > check handler couldn't do the same?
>
> The 32bit machine check code has some serious design problems. The
> best would be probably to just move 32bit over to the 64bit code too. In
> fact there was a patch to do that some time ago, but it ran into some
> minor problems and was unfortunately never merged. But it would be the
> right thing to do.
I found a patch from about three years ago that implemented a 32-bit
version of the x86_64 machine check handler. Do you know of any newer
attempts?
However, given the merge of x86, a single implementation should be able
to handle both the 32-bit and 64-bit cases. I tried to build the 64-bit
machine check handler (mce_64.c) for 32-bit to see what kind of problems it
would run into. So far I have found a few things:
- there is no idle_notifier_register in 32-bit x86
- there is no oops_begin in 32-bit x86
- register names are different (rip, cs)
- some data types would have to be adjusted to be 64 bit
The issues seem to be surmountable.
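[Editor's sketch, not from the thread.] One plausible reading of the data type item in the list above: machine check status values are 64 bits wide, so any variable that is only 32 bits wide on a 32-bit build (for example an unsigned long) would silently lose the upper half, including the valid bit at bit 63. The helper names below are hypothetical, and the code is plain user-space C rather than kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_STATUS_VAL (1ULL << 63)    /* valid bit lives in the upper half */

/* Join the two 32-bit halves of an MSR value into one 64-bit quantity. */
static uint64_t mock_msr_join(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Test the valid bit; only meaningful if the caller kept all 64 bits. */
static int mock_status_valid(uint64_t status)
{
    return (status & MOCK_STATUS_VAL) != 0;
}
```

Storing the joined value in a 32-bit variable and passing that to mock_status_valid() would always report "not valid", which is the kind of bug an explicit 64-bit type avoids.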
> The only missing functionality on the 64bit side would be support for
> old, non-IA-compliant machine checks like P5 or WinChip. One option
> would be to simply drop them. AFAIK these CPUs don't really have
> anywhere near usable machine check capability anyways so dropping it
> would not make much difference. Or alternatively keep p5.c/winchip.c
> around. But if you look at them they don't do much except simple
> printk with not much information and printk in a machine check handler
> is always wrong because it can deadlock. I personally would prefer
> dropping.
>
> And I think one or two K7 quirks are also missing on 64bit, but these
> would be very easy to add. Other than that it should just work on
> 32bit CPUs.
>
So it looks like giving 32-bit x86 the same machine check support as in
64-bit is both feasible and desirable.
Are there any plans to do this or is anybody currently working on it?
thanks,
Max
* Re: x86 32-bit machine check handler
From: Andi Kleen @ 2007-11-15 5:36 UTC (permalink / raw)
To: Max Asbock; +Cc: Andi Kleen, lkml, tglx, mingo, hpa
> I found a patch from about three years ago that implemented a 32-bit
> version of the x86_64 machine check handler. Do you know of any newer
> attempts?
No.
> However, given the merge of x86, a single implementation should be able
> to handle both the 32-bit and 64-bit cases. I tried to build the 64-bit
> machine check handler (mce_64.c) for 32-bit to see what kind of problems it
> would run into. So far I found a few things:
> - there is no idle_notifier_register in 32-bit x86
There used to be one, just needs to be readded.
> - there is no oops_begin in 32-bit x86
> - register names are different (rip, cs)
regs->rip -> instruction_pointer()
->cs just needs a similar macro
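[Editor's sketch, not from the thread.] The accessor idea suggested above could look roughly like this. Everything here is a user-space mock: the struct layouts, the MOCK_BUILD_32BIT switch, and the MOCK_IP/MOCK_CS macros are hypothetical stand-ins, not the kernel's pt_regs or its instruction_pointer() helper. The point is only that shared code reads registers through accessors, so the field names can differ per build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layouts with differently named fields. */
struct mock_regs32 { uint32_t eip; uint32_t xcs; };
struct mock_regs64 { uint64_t rip; uint64_t cs;  };

#ifdef MOCK_BUILD_32BIT
typedef struct mock_regs32 mock_regs_t;
#define MOCK_IP(regs) ((uint64_t)(regs)->eip)
#define MOCK_CS(regs) ((uint16_t)((regs)->xcs & 0xffff))
#else
typedef struct mock_regs64 mock_regs_t;
#define MOCK_IP(regs) ((uint64_t)(regs)->rip)
#define MOCK_CS(regs) ((uint16_t)((regs)->cs & 0xffff))
#endif

/* Hypothetical record of where the machine check hit. */
struct mock_mce {
    uint64_t ip;
    uint16_t cs;
};

/* Shared code: no direct reference to rip/eip or cs/xcs. */
static void mock_record_context(struct mock_mce *m, const mock_regs_t *regs)
{
    assert(m != NULL && regs != NULL);
    m->ip = MOCK_IP(regs);
    m->cs = MOCK_CS(regs);
}
```

Building the same file with -DMOCK_BUILD_32BIT selects the other layout without touching mock_record_context().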
> So it looks like giving 32-bit x86 the same machine check support as in
> 64-bit is both feasible and desirable.
Yep.
-Andi