* High Interrupt load crashes SMP Athlon MPs
@ 2005-02-09 19:54 Eric Bambach
From: Eric Bambach @ 2005-02-09 19:54 UTC
To: linux-kernel
Hello,
After much research I have come to the conclusion that the kernel might
be at fault (in conjunction with the mobo) for hard-locking my box. Please
read below to see if you can help me.
I am at my wit's end with this MSI K7D Master-L board. I have narrowed it
down to this: anything that causes a lot of interrupts will lock the box.
By lock I mean a HARD lock: no ping, no mouse, no oops, no sysrq, no nothing.
It is a sudden and TOTAL lockup. A ping -f and playing xmms at the same time
will lock the box within minutes, whereas with very little activity it can
run for perhaps an hour or more (I haven't had the time or the patience to
baby-sit an idle box for more than an hour. This is my main box.)
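To put a number on "a lot of interrupts", the rate can be estimated by reading /proc/interrupts twice and dividing the difference by the time between reads. The following is only a rough sketch, not something that was run on the affected box: the helper names (parse_interrupts, interrupt_rates, sample_rates) are made up for illustration, and it assumes each line starts with a label and a colon followed by one count per CPU.

```python
import time


def parse_interrupts(text):
    """Return {label: total count across CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/handler columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for every line present in both snapshots."""
    return {irq: (after[irq] - before[irq]) / seconds
            for irq in after if irq in before}


def sample_rates(path="/proc/interrupts", interval=1.0):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    return interrupt_rates(first, second, interval)
```

Calling sample_rates() while the ping -f and xmms load is running, and again while the box is idle, would show how far apart the two interrupt rates actually are.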
When I ran this box as a single-CPU system with an Athlon MP 2400 it ran fine.
Perhaps something in SMP is triggering a nasty bug.
The motherboard was bought to replace another motherboard so that I could go
SMP. The RAM, power supply, ALL cards, basically all the hardware are known
good and have been used for months in another system. The lowest common
denominator here is the mobo.
I have googled extensively and tried many things to alleviate the problem,
so far with no success. Does anyone have ideas for determining whether this
is a kernel problem?
Is this a known problem? Is there a patch to fix this? I am trying to avoid
replacing/returning such a beautiful and expensive motherboard.
Here is an excerpt from a Red Hat mailing list:
>If your motherboard's using the AMD-768 chipset for the Southbridge, you
>may have run afoul of a bug in interrupt masking which can hang the
>system. The reports thus far on the linux kernel list imply that plugging
>in a PS/2 mouse seems to work around the problem; it's worked for me (MSI
>K7D Master board with dual Athlons), though I've only had a few days'
>trial so far.
So it seems to suggest that the kernel is at least partly involved in the
lockup. Anyone have any suggestions? Any info I can provide?
Here is a non-exhaustive unordered list of various things I've tried.
-Combinations of noapic, nolapic, acpi=off
-Increasing the vcore slightly
-Downloading and compiling the latest kernel (see uname -a output)
-Recompiled the kernel, ran make clean
-turning DMA off in the bios
-turning DMA off via hdparm
-running without a PATA drive at all (all scsi)
-Removing ALL unnecessary cards
-Removing ALL unnecessary devices (to reduce power consumption)
-Disabling USB and removing USB support from the kernel
-Installing sensors and making sure voltages/temps are nominal (they are)
-installed and used irqbalance
-Disable preempt
-Try the onboard lan and a 64bit pci gigabit lan card mutually exclusively
-Removed side-panel and verified heat is not an issue
-Updated to latest BIOS version
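Since the boot options were tried in ad hoc combinations, here is a small sketch that lists every subset of noapic, nolapic and acpi=off so that none gets missed when retesting. It only prints a checklist; it does not touch the bootloader. The function name boot_option_sets is hypothetical.

```python
from itertools import combinations

# Boot options from the list above.
BOOT_OPTIONS = ["noapic", "nolapic", "acpi=off"]


def boot_option_sets(options):
    """Yield every subset of the options, from none to all of them."""
    for size in range(len(options) + 1):
        for subset in combinations(options, size):
            yield " ".join(subset)


# Print a numbered checklist of kernel command-line fragments to try.
for number, cmdline in enumerate(boot_option_sets(BOOT_OPTIONS), start=1):
    print(f"{number}: {cmdline or '(no extra options)'}")
```

With three options this gives eight boot configurations, including the baseline with nothing added.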
---------------hardware and kernel specs--------------
bot403@eric bot403 $ su
Password:
</home/bot403:13:32:45>
root@eric >uname -a
Linux eric 2.6.10-gentoo-r6 #2 SMP Tue Feb 8 17:12:59 CST 2005 i686 AMD
Athlon(tm) MP 2600+ AuthenticAMD GNU/Linux
</home/bot403:13:32:47>
root@eric >lspci -v
0000:00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
        Flags: bus master, 66Mhz, medium devsel, latency 32
        Memory at e8000000 (32-bit, prefetchable)
        Memory at fd005000 (32-bit, prefetchable) [size=4K]
        I/O ports at ec00 [disabled] [size=4]
        Capabilities: [a0] AGP version 2.0

0000:00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge (prog-if 00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 32
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
        Memory behind bridge: f8000000-f9ffffff
        Prefetchable memory behind bridge: f0000000-f7ffffff

0000:00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 05)
        Flags: bus master, 66Mhz, medium devsel, latency 0

0000:00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 04) (prog-if 8a [Master SecP PriP])
        Subsystem: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE
        Flags: bus master, medium devsel, latency 32
        I/O ports at e000 [size=16]

0000:00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI (rev 03)
        Subsystem: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI
        Flags: medium devsel

0000:00:09.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)
        Subsystem: LSI Logic / Symbios Logic: Unknown device 1030
        Flags: bus master, medium devsel, latency 72, IRQ 153
        I/O ports at e400
        Memory at fd006000 (64-bit, non-prefetchable) [size=1K]
        Memory at fd002000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 2

0000:00:09.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)
        Subsystem: LSI Logic / Symbios Logic: Unknown device 1030
        Flags: bus master, medium devsel, latency 72, IRQ 161
        I/O ports at e800
        Memory at fd004000 (64-bit, non-prefetchable) [size=1K]
        Memory at fd000000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 2

0000:00:10.0 PCI bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] PCI (rev 05) (prog-if 00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 32
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
        I/O behind bridge: 0000d000-0000dfff
        Memory behind bridge: fb000000-fcffffff
        Expansion ROM at 0000d000 [disabled] [size=4K]

0000:01:05.0 VGA compatible controller: nVidia Corporation NV28 [GeForce4 Ti 4200 AGP 8x] (rev a1) (prog-if 00 [VGA])
        Flags: bus master, 66Mhz, medium devsel, latency 248, IRQ 153
        Memory at f8000000 (32-bit, non-prefetchable)
        Memory at f0000000 (32-bit, prefetchable) [size=128M]
        Capabilities: [60] Power Management version 2
        Capabilities: [44] AGP version 3.0

0000:02:09.0 Ethernet controller: Intel Corp. 82559ER (rev 09)
        Subsystem: Intel Corp.: Unknown device 3000
        Flags: bus master, medium devsel, latency 32, IRQ 153
        Memory at fc020000 (32-bit, non-prefetchable)
        I/O ports at d000 [size=64]
        Memory at fc000000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [dc] Power Management version 2
</home/bot403:13:32:49>
root@eric >cat /proc/interrupts
           CPU0       CPU1
  0:     315584     309379    IO-APIC-edge  timer
  1:        638        724    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  8:          2          0    IO-APIC-edge  rtc
 12:       8054      11784    IO-APIC-edge  i8042
 14:       1879       1247    IO-APIC-edge  ide0
153:    1174407    1153144   IO-APIC-level  sym53c8xx, eth0, nvidia
161:          0          0   IO-APIC-level  sym53c8xx
NMI:          0          0
LOC:     624860     624868
ERR:          0
MIS:          0
</home/bot403:13:32:52>
root@eric >cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 10
model name : AMD Athlon(tm) MP 2600+
stepping : 0
cpu MHz : 1999.946
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 mmx fxsr sse pni syscall mp mmxext 3dnowext 3dnow
bogomips : 3932.16
processor : 1
vendor_id : AuthenticAMD
cpu family : 6
model : 10
model name : AMD Athlon(tm) MP
stepping : 0
cpu MHz : 1999.946
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 mmx fxsr sse pni syscall mp mmxext 3dnowext 3dnow
bogomips : 3989.50
</home/bot403:13:32:55>
root@eric >
---------------EOF hardware and kernel specs EOF--------------
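One thing that stands out in the /proc/interrupts output above is that IRQ 153 is shared by the SCSI controller, eth0 and the nvidia driver. A minimal sketch for picking out shared lines automatically follows. It assumes the layout shown in this post (one count per CPU, then a single controller-type column, then a comma-separated handler list), and the function name shared_irqs is made up for illustration.

```python
def shared_irqs(text):
    """Map IRQ label -> handler list for lines with more than one handler."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Skip the per-CPU counts and the controller-type column (assumed
        # to be exactly one field, as in the output posted above).
        handlers_text = " ".join(fields[ncpus + 1:])
        handlers = [h.strip() for h in handlers_text.split(",") if h.strip()]
        if len(handlers) > 1:
            shared[label.strip()] = handlers
    return shared


# The output pasted in this post, whitespace collapsed.
SAMPLE = """\
CPU0 CPU1
0: 315584 309379 IO-APIC-edge timer
1: 638 724 IO-APIC-edge i8042
2: 0 0 XT-PIC cascade
8: 2 0 IO-APIC-edge rtc
12: 8054 11784 IO-APIC-edge i8042
14: 1879 1247 IO-APIC-edge ide0
153: 1174407 1153144 IO-APIC-level sym53c8xx, eth0, nvidia
161: 0 0 IO-APIC-level sym53c8xx
NMI: 0 0
LOC: 624860 624868
ERR: 0
MIS: 0
"""

print(shared_irqs(SAMPLE))  # {'153': ['sym53c8xx', 'eth0', 'nvidia']}
```

Whether the sharing has anything to do with the lockups is not established here; the sketch only makes the sharing easy to spot when comparing different boot configurations.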
----------------------------------------
--EB
> All is fine except that I can reliably "oops" it simply by trying to read
> from /proc/apm (e.g. cat /proc/apm).
> oops output and ksymoops-2.3.4 output is attached.
> Is there anything else I can contribute?
The latitude and longtitude of the bios writers current position, and
a ballistic missile.
--Alan Cox LKML-December 08,2000
----------------------------------------