public inbox for linux-kernel@vger.kernel.org
* reason for delay in arch/x86/kernel/traps.c::io_check_error()?
From: Chris Friesen @ 2009-03-09 19:33 UTC (permalink / raw)
  To: linux-kernel, Andi Kleen, Ingo Molnar


Hi all,

I was just wondering about the basis for the delay in io_check_error().
The ICH7 manual doesn't mention any delay being required here. Is it
necessary for other hardware, required by something not mentioned in the
manual, or just an accident?

Thanks,

Chris


* Re: reason for delay in arch/x86/kernel/traps.c::io_check_error()?
From: Andi Kleen @ 2009-03-10  3:49 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-kernel, Andi Kleen, Ingo Molnar

On Mon, Mar 09, 2009 at 01:33:44PM -0600, Chris Friesen wrote:
> 
> Hi all,
> 
> I was just wondering about the basis for the delay in io_check_error(). 
>  The ICH7 manual doesn't have any mention of a delay being required 
> here--is it necessary for other hardware, something not mentioned in the 
> manual, or just an accident?

None of the NMI error-logging code really applies to any modern
chipset; it is for ancient hardware (286/386 generation). Those often
needed strange delays too.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: reason for delay in arch/x86/kernel/traps.c::io_check_error()?
From: Ingo Molnar @ 2009-03-10 15:32 UTC (permalink / raw)
  To: Chris Friesen
  Cc: linux-kernel, Andi Kleen, H. Peter Anvin, Thomas Gleixner,
	Arjan van de Ven, Yinghai Lu


* Chris Friesen <cfriesen@nortel.com> wrote:

> Hi all,
>
> I was just wondering about the basis for the delay in 
> io_check_error().  The ICH7 manual doesn't have any mention of 
> a delay being required here--is it necessary for other 
> hardware, something not mentioned in the manual, or just an 
> accident?

That code has bitrotted badly over the years. All those 
port 61H accesses:

arch/x86/kernel/traps.c:                reason = get_nmi_reason();
arch/x86/kernel/traps.c:        outb(reason, 0x61);
arch/x86/kernel/traps.c:        outb(reason, 0x61);
arch/x86/kernel/traps.c:        outb(reason, 0x61);

... are often wrong on modern chipsets - including the logic in 
io_check_error(). But we don't really have low-level chipset 
drivers at this level in Linux, so there's nothing suitable to 
replace it with, and it never got fixed.

Can you see this trigger on a box perhaps? Or are you worried 
about the potential unbound execution time of this function 
which can be up to 2 seconds in NMI context?

	Ingo
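
For readers without the source handy, here is a minimal user-space
sketch of the sequence under discussion: write port 0x61 with the IOCHK
control bit set, busy-wait, then write it again with the bit cleared.
The loop shape is how I recall io_check_error() in kernels of that era
and should be checked against the actual tree. fake_outb(),
fake_udelay(), the macro names and sketch_io_check_error() are
hypothetical stand-ins so that it compiles outside the kernel.

```c
#include <stdint.h>

/* Names are illustrative; the thread itself only mentions port 0x61. */
#define NMI_PORT      0x61
#define CTRL_IOCHK    0x08	/* write bit 3: 1 = clear and disable IOCHK */

/* Hypothetical stand-ins for the kernel's outb() and udelay(). */
static uint8_t fake_port61;
static unsigned long simulated_usecs;

static void fake_outb(uint8_t value, uint16_t port)
{
	if (port == NMI_PORT)
		fake_port61 = value;
}

static void fake_udelay(unsigned long usecs)
{
	/* Only account for the time; do not actually spin. */
	simulated_usecs += usecs;
}

/* Assumed shape of the IOCHK path: disable, wait, re-enable. */
uint8_t sketch_io_check_error(uint8_t reason)
{
	unsigned long i;

	/* Keep only the writable low nibble and set the IOCHK control bit. */
	reason = (uint8_t)((reason & 0x0f) | CTRL_IOCHK);
	fake_outb(reason, NMI_PORT);

	/* The delay being asked about: 1 ms steps, close to 2 s in total. */
	i = 2000;
	while (--i)
		fake_udelay(1000);

	/* Clear the control bit again to re-enable the line. */
	reason &= (uint8_t)~CTRL_IOCHK;
	fake_outb(reason, NMI_PORT);
	return reason;
}
```

Calling sketch_io_check_error(0x40) leaves simulated_usecs at 1999000,
since the `while (--i)` loop body runs 1999 times. That is the roughly
2 seconds spent in NMI context mentioned above.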


* Re: reason for delay in arch/x86/kernel/traps.c::io_check_error()?
From: Chris Friesen @ 2009-03-12 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Andi Kleen, H. Peter Anvin, Thomas Gleixner,
	Arjan van de Ven, Yinghai Lu

Ingo Molnar wrote:
> * Chris Friesen <cfriesen@nortel.com> wrote:
> 
>> Hi all,
>>
>> I was just wondering about the basis for the delay in 
>> io_check_error().  The ICH7 manual doesn't have any mention of 
>> a delay being required here--is it necessary for other 
>> hardware, something not mentioned in the manual, or just an 
>> accident?
> 
> That code has seriously bitrotten along the years. All those 
> port 61H accesses:
> 
> arch/x86/kernel/traps.c:                reason = get_nmi_reason();
> arch/x86/kernel/traps.c:        outb(reason, 0x61);
> arch/x86/kernel/traps.c:        outb(reason, 0x61);
> arch/x86/kernel/traps.c:        outb(reason, 0x61);
> 
> ... are often wrong on modern chipsets - including the logic in 
> io_check_error(). But we dont really have lowlevel chipset 
> drivers on this level in Linux, so there's nothing suitable to 
> replace it with and it never got fixed.
> 
> Can you see this trigger on a box perhaps? Or are you worried 
> about the potential unbound execution time of this function 
> which can be up to 2 seconds in NMI context?

This is in the context of an embedded, highly available compute blade.
As part of our enhanced error handling, we've modified the memory parity
error code to re-enable rather than disable the error line.

Given that the memory and IO code paths are just different bits in the
same register, we originally added the delay to the memory parity path as
well.  However, we subsequently hit the memory parity error path, and
the 2-second delay triggered our hardware watchdog, causing the board to
reboot.

As you can imagine this is undesirable, so we were hoping to remove the 
delay from both paths.  From what you've said and the fact that no delay 
is mentioned in the chip manual, it seems like this should be fairly safe.

Thanks,
Chris
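
To make the "different bits in the same register" point concrete, here
is a minimal sketch of what a delay-free version of both paths might
look like. It is an illustration only, not the actual patch, which the
thread does not show. It assumes the usual port 0x61 layout (status in
bits 7 and 6 on read, clear/disable controls in bits 2 and 3 on write);
fake_outb_61(), rearm_line() and sketch_handle_nmi_reason() are
hypothetical names, and whether a given chipset tolerates back-to-back
writes is exactly the open question above.

```c
#include <stdint.h>

#define STATUS_MEM_PARITY 0x80	/* read bit 7: parity/SERR error latched */
#define STATUS_IOCHK      0x40	/* read bit 6: I/O channel check latched */
#define CTRL_MEM_PARITY   0x04	/* write bit 2: 1 = clear and disable */
#define CTRL_IOCHK        0x08	/* write bit 3: 1 = clear and disable */

/* Hypothetical stand-in for outb(value, 0x61): records each write. */
static uint8_t port61_writes[8];
static int port61_write_count;

static void fake_outb_61(uint8_t value)
{
	if (port61_write_count < 8)
		port61_writes[port61_write_count++] = value;
}

/* Set the control bit to clear the latch, then clear it to re-arm. */
static void rearm_line(uint8_t reason, uint8_t ctrl_bit)
{
	reason = (uint8_t)((reason & 0x0f) | ctrl_bit);
	fake_outb_61(reason);
	/* No delay here: this is the assumption being tested. */
	reason &= (uint8_t)~ctrl_bit;
	fake_outb_61(reason);
}

/* Both error sources share one register and differ only in the bit. */
void sketch_handle_nmi_reason(uint8_t reason)
{
	if (reason & STATUS_MEM_PARITY)
		rearm_line(reason, CTRL_MEM_PARITY);
	if (reason & STATUS_IOCHK)
		rearm_line(reason, CTRL_IOCHK);
}
```

With reason 0x80 this records the writes 0x04 then 0x00; with 0x40 it
records 0x08 then 0x00. The only difference between the two paths is
which control bit is toggled.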
