* Kernel lockups on dual-Athlon board -- help wanted
@ 2001-08-11 10:23 Eric S. Raymond
From: Eric S. Raymond @ 2001-08-11 10:23 UTC (permalink / raw)
  To: Linux Kernel List

Gary Sandine of Los Alamos Computers and I are attempting to qualify
Linux on a Tyan 2462 K7 Thunder motherboard -- dual Athlon 1200 MP
chips supported by an AMD 760 chipset.  We have been seeing mysterious
lockups during commands to build things from source, like kernels and X.

We've been trying to track down the problem for about sixteen hours
and have gathered quite a bit of data, but don't have a theory to explain
it.

First, we have established that this is a real kernel hang, not just a 
bad device state:

A. Lockups can be induced in either console or X mode.  A reliable way to 
   induce them is to run `make clean' on an X tree (any sufficiently 
   long-running command seems to do it).

B. We logged in over the network, started a top(1) in the network
   session, induced the hang on the console, and watched top(1) freeze.
   So 

C. The magic AltSysRq command is ineffective when the lockups happen.
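
The remote-session check in B can also be done with a timestamped heartbeat
instead of top(1), so the last line shown in the network session records when
userspace stopped running. This is a minimal sketch and not part of the
original report: the use of Python and the names heartbeat_line and run are
assumptions made for illustration.

```python
import sys
import time


def heartbeat_line(seq, now):
    """Format one heartbeat record: sequence number plus UTC wall-clock time."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    return "%d %s UTC" % (seq, stamp)


def run(count=None, interval=1.0, out=sys.stdout):
    """Write a heartbeat every `interval` seconds.

    count=None keeps going until the process is killed (or the machine hangs).
    """
    seq = 0
    while count is None or seq < count:
        out.write(heartbeat_line(seq, time.time()) + "\n")
        out.flush()  # flush so the remote terminal sees each line immediately
        seq += 1
        time.sleep(interval)


if __name__ == "__main__":
    run()
```

Started in the network session before inducing the hang on the console, the
script stops printing when the machine freezes; the final sequence number and
timestamp give a rough time of the lockup to compare across test runs.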

Here's what we know about it:

1. Lockups never occur under a uniprocessor kernel.

2. Configuring APM and ACPI out of the kernel does not prevent the lockups.
   Disabling ACPI and power management doesn't stop them either.

3. Changing kernels from 2.4.3 to 2.4.7 doesn't prevent the lockups.

4. The SMP kernel built for either PII or AMD (no APM, no ACPI) locks up.

5. There is an undocumented BIOS setting "Use PCI Interrupt Entries in 
   MP table."  By default it is on.  Turning it off doesn't prevent the
   lockups.

6. Here's a weird one.  When the kernel is running, the power switch
   has to be pressed down for 4 seconds to power down the machine.  But
   during a lockup it powers down the machine instantly.

What we're seeing suggests some bad interaction between the SMP
support and the hardware.  But item 6 hints that power management
could be involved, even though we have it configured out.

Anybody have a brilliant insight?  Suggestions for further tests?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government should be weak, amateurish and ridiculous. At present, it
fulfills only a third of the role.
	-- Edward Abbey


