* 3c59x + dpti2o problem with interrupt sharing?
@ 2001-09-25 5:17 Brian Strand
From: Brian Strand @ 2001-09-25 5:17 UTC
To: linux-kernel
We just got a new dual AMD box with an Adaptec IDE RAID card and a dual
3com NIC on the motherboard, running the stock SuSE 7.2 kernel
(2.4.4-Suse). All was well for a few days, but while the box was doing
some moderate Oracle work involving the 3com and the RAID card, it fell
off the network, and /var/log/messages was suddenly flooded with the
following:
Sep 22 14:25:34 db001 kernel: eth0: Host error, FIFO diagnostic register 0000.
Sep 22 14:25:34 db001 kernel: scsi0: PCI error Interrupt at seqaddr = 0x8
Sep 22 14:25:34 db001 kernel: eth0: PCI bus error, bus status 800000a0
Sep 22 14:25:34 db001 kernel: scsi0: Data Parity Error Detected during address or write data phase
Sep 22 14:25:34 db001 kernel: eth0: using NWAY device table, not 8
Sep 22 14:25:34 db001 kernel: scsi1: PCI error Interrupt at seqaddr = 0x8
Sep 22 14:25:34 db001 kernel: scsi1: Data Parity Error Detected during address or write data phase
Sep 22 14:25:34 db001 kernel: eth0: Host error, FIFO diagnostic register 0000.
Sep 22 14:25:34 db001 kernel: eth0: PCI bus error, bus status 80000020
Sep 22 14:25:34 db001 kernel: eth0: using NWAY device table, not 8
Sep 22 14:25:35 db001 kernel: eth0: Host error, FIFO diagnostic register 0000.
Sep 22 14:25:35 db001 kernel: eth0: PCI bus error, bus status 80000020
Sep 22 14:25:35 db001 kernel: eth0: using NWAY device table, not 8
The last three lines then repeat for roughly 300,000 lines. After
stripping the timestamps and running
egrep 'eth|scsi' /var/log/messages | sort | uniq
over the result, I got:
NETDEV WATCHDOG: eth0: transmit timed out
eth0: 3Com PCI 3c980 10/100 Base-TX NIC(Python-T) at 0x1400, 00:e0:81
eth0: 3Com PCI 3c980 10/100 Base-TX NIC(Python-T) at 0x1c00, 00:e0:81
eth0: Host error, FIFO diagnostic register 0000.
eth0: Interrupt posted but not delivered -- IRQ blocked by another dev
eth0: PCI bus error, bus status 80000020
eth0: PCI bus error, bus status 800000a0
eth0: Resetting the Tx ring pointer.
eth0: Too much work in interrupt, status e003.
eth0: transmit timed out, tx_status 00 status 7003.
eth0: transmit timed out, tx_status 00 status 7043.
eth0: using NWAY device table, not 8
scsi0: Data Parity Error Detected during address or write data phase
scsi0: PCI error Interrupt at seqaddr = 0x8
scsi1: Data Parity Error Detected during address or write data phase
scsi1: PCI error Interrupt at seqaddr = 0x8
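For anyone repeating this on a large log, the strip-and-dedupe step described above can be sketched in Python. This is only a minimal sketch and not the command that produced the list: the function name summarize and the prefix regex are assumptions made for illustration, and the regex assumes the classic syslog layout seen in the excerpt (month, day, time, host, then "kernel:"). Unlike sort | uniq, it also keeps a count per message, which helps separate the one-off messages from the ones that flooded the log.

```python
import re
from collections import Counter

# Assumed syslog prefix, e.g. "Sep 22 14:25:34 db001 kernel: ".
SYSLOG_PREFIX = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+ kernel: "
)


def summarize(lines, pattern=r"eth|scsi"):
    """Count unique kernel messages that match `pattern`.

    The timestamp/host prefix is removed first so that identical
    messages logged at different times collapse into one entry.
    """
    wanted = re.compile(pattern)
    counts = Counter()
    for line in lines:
        msg = SYSLOG_PREFIX.sub("", line.rstrip("\n"))
        if wanted.search(msg):
            counts[msg] += 1
    return counts


def print_summary(path="/var/log/messages"):
    """Print each unique message with its count, sorted by message text."""
    with open(path, errors="replace") as fh:
        for msg, n in sorted(summarize(fh).items()):
            print(f"{n:8d}  {msg}")
```

Calling print_summary() with no argument reads /var/log/messages, which normally requires root.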
Looking in /proc/interrupts, I noticed that eth0 and dpti were sharing
an IRQ. Is this the likely cause of the network failure, and if so,
does anyone know of a way to get the PCI BIOS to assign separate IRQs to
the RAID card and the dual 3com? (I have a Tyan S2462 Thunder K7 board
with nothing in the manual about this.) I have disabled onboard SCSI
(dual AIC7xxx), serial, and parallel as well as pulled the RAID card
from the machine and power-cycled a few times, but when I put it back
in, it's sharing an IRQ with the 3com again (I suppose I should try
disabling/enabling the 3coms too).
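The check for shared IRQs can also be scripted, which is handy when re-testing after each BIOS change. The following is a rough sketch under stated assumptions: the function name shared_irqs is hypothetical, and the parsing assumes the usual /proc/interrupts layout in which devices on the same line are listed comma-separated at the end of the row and device names contain no spaces.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQ lines with >1 device.

    `text` is the content of /proc/interrupts. Rows whose label is not
    a number (the CPU header, NMI, LOC, ERR, ...) are skipped.
    """
    shared = {}
    for line in text.splitlines():
        irq, sep, rest = line.partition(":")
        irq = irq.strip()
        if not sep or not irq.isdigit():
            continue
        if "," not in rest:
            continue  # only one device registered on this line
        parts = [p.strip() for p in rest.split(",")]
        tokens = parts[0].split()
        if not tokens:
            continue
        # The first chunk still carries the counters and the controller
        # name; the device is assumed to be its last token.
        shared[int(irq)] = [tokens[-1]] + parts[1:]
    return shared


def report(path="/proc/interrupts"):
    """Print every shared IRQ found in `path`."""
    with open(path) as fh:
        for irq, devices in sorted(shared_irqs(fh.read()).items()):
            print(f"IRQ {irq}: {', '.join(devices)}")
```

An empty result from report() means no line currently has more than one registered handler.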
A related question is: should these drivers be able to share IRQs, i.e.
is it a worthwhile goal to have them operate reliably while sharing
IRQs, or is IRQ-sharing a performance loss and something to be avoided?
Thanks,
Brian Strand
* RE: 3c59x + dpti2o problem with interrupt sharing?
@ 2001-09-25 18:57 Bonds, Deanna
From: Bonds, Deanna @ 2001-09-25 18:57 UTC
To: 'Brian Strand', linux-kernel
> Looking in /proc/interrupts, I noticed that eth0 and dpti were sharing
> an IRQ. Is this the likely cause of the network failure, and if so,
> does anyone know of a way to get the PCI BIOS to assign separate IRQs to
>
> A related question is: should these drivers be able to share IRQs, i.e.
> is it a worthwhile goal to have them operate reliably while sharing
> IRQs, or is IRQ-sharing a performance loss and something to be avoided?
The Adaptec card can share interrupts, but it is not wise to do that with
another card that is going to generate high-priority interrupts. You most
likely need to change the motherboard BIOS settings. If you are not using
your onboard IDE, you can disable it, freeing up another high-priority
interrupt. Otherwise, you can manually assign the interrupts through the
BIOS.
Deanna
* Re: 3c59x + dpti2o problem with interrupt sharing?
@ 2001-09-25 22:24 Brian Strand
From: Brian Strand @ 2001-09-25 22:24 UTC
To: Bonds, Deanna; +Cc: linux-kernel
Bonds, Deanna wrote:
>>A related question is: should these drivers be able to share IRQs, i.e.
>>is it a worthwhile goal to have them operate reliably while sharing
>>IRQs, or is IRQ-sharing a performance loss and something to be avoided?
>
>The Adaptec card can share interrupts, but it is not wise to do that with
>another card that is going to be a high priority interrupt. You most likely
>need to change the motherboard bios settings. If you are not using your
>onboard IDE you can disable that freeing up another high priority interrupt.
>Otherwise you can manually assign the interrupts through the bios
>
I disabled onboard IDE (secondary channel; primary is needed for CDROM),
serial, parallel, SCSI, USB, and one of the two onboard NICs, and now no
devices are sharing an IRQ. Unfortunately this BIOS (Phoenix
ServerBIOS 2 Rel 6.0, Tyan Thunder K7 BIOS v2.07a) does not allow me to
assign IRQs to PCI slots, and I cannot move the RAID card around because
it is in a 2U riser card. Hopefully this will solve the problem.

On a broader note, what is the cause of this problem? You indicated that
the Adaptec card can share interrupts, so was the problem the 3com
driver, the 3com hardware, the motherboard, or some other portion of the
kernel? Given the scarcity of interrupts, this situation probably comes
up a lot, so we should handle it gracefully (meaning that I should go
investigate the 3c59x and dpti2o drivers some more if it is a problem
which is solvable in software).
Thanks,
Brian Strand