* e100 PCI bridge problem
@ 2007-07-13 17:37 William Montgomery
2007-07-13 20:36 ` Kok, Auke
2007-07-14 14:43 ` Krzysztof Halasa
0 siblings, 2 replies; 14+ messages in thread
From: William Montgomery @ 2007-07-13 17:37 UTC (permalink / raw)
To: linux-kernel
In an earlier post to the list I described a hard lockup condition
that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
and occurs on 2 out of 3 computers.
Further testing has revealed that the lockup can be prevented on all
computers by making sure the card is installed on the primary PCI bus.
If the card is installed in a slot on the secondary PCI bus (behind a
PCI to PCI bridge) the lockup occurs.
Are there any PCI tuning registers that I can tweak to get around
this problem? Any changes I could make to the e100 driver to fix this?
Any help appreciated.
Regards,
Wm
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-13 17:37 e100 PCI bridge problem William Montgomery
@ 2007-07-13 20:36 ` Kok, Auke
2007-07-13 22:30 ` William Montgomery
2007-07-14 14:43 ` Krzysztof Halasa
1 sibling, 1 reply; 14+ messages in thread
From: Kok, Auke @ 2007-07-13 20:36 UTC (permalink / raw)
To: William Montgomery; +Cc: linux-kernel
William Montgomery wrote:
> In an earlier post to the list I described a hard lockup condition
> that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
> a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
> and occurs on 2 out of 3 computers.
>
> Further testing has revealed that the lockup can be prevented on all
> computers by making sure the card is installed on the primary PCI bus.
> If the card is installed in a slot on the secondary PCI bus (behind a
> PCI to PCI bridge) the lockup occurs.
sounds like int-A/B/C/D routing issues
> Are there any PCI tuning registers that I can tweak to get around
> this problem? Any changes I could make to the e100 driver to fix this?
this issue might be resolvable by quirking the bridge chips and adjusting any
APIC where needed. Unfortunately I don't know much about this but it's
physically not possible from the e100 driver. The special (non-intel) card that
has these 4 ports onboard contains a bridge chip itself which explains the
issues. Even a BIOS issue could be the cause here.
Perhaps the linuxfirmwarekit will reveal more information. In any case, fixing
this in software would be a gigantic effort.
Auke
^ permalink raw reply [flat|nested] 14+ messages in thread
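The "int-A/B/C/D routing" mentioned in the message above refers to how interrupt pins are rotated when they cross a PCI-to-PCI bridge. Below is a minimal sketch of the conventional rotation, assuming every bridge in the path follows it (a bridge in non-transparent mode or a board with custom wiring may not). The function names `swizzle` and `route` are hypothetical, and the device numbers are taken from the `lspci` listing posted later in this thread.

```python
from typing import Sequence

PINS = "ABCD"


def swizzle(pin: str, device: int) -> str:
    """Return the INTx pin seen on the parent bus for `pin` on slot `device`.

    Conventional rule: parent_pin = (pin + device number) mod 4.
    """
    return PINS[(PINS.index(pin) + device) % 4]


def route(pin: str, devices_toward_root: Sequence[int]) -> str:
    """Apply the swizzle once per non-root bus crossed.

    devices_toward_root[0] is the device number of the endpoint itself;
    each following entry is the device number of the next bridge on the
    way to bus 00 (the bridge that sits on bus 00 is not included).
    """
    for device in devices_toward_root:
        pin = swizzle(pin, device)
    return pin


if __name__ == "__main__":
    # Endpoints 03:04.0 .. 03:07.0, then bridge 02:06.0, then bridge 01:0c.0.
    for nic_dev in (0x04, 0x05, 0x06, 0x07):
        result = route("A", [nic_dev, 0x06, 0x0C])
        print(f"03:{nic_dev:02x}.0 INTA -> INT{result} at the root bus")
```

Under these assumptions the four ports end up on four different pins, which is at least consistent with the four distinct IRQs (18, 19, 16, 17) reported later in the thread. The thread does not confirm how those pins map to IRQ numbers on this board.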
* Re: e100 PCI bridge problem
2007-07-13 20:36 ` Kok, Auke
@ 2007-07-13 22:30 ` William Montgomery
2007-07-13 22:41 ` Kok, Auke
0 siblings, 1 reply; 14+ messages in thread
From: William Montgomery @ 2007-07-13 22:30 UTC (permalink / raw)
To: Kok, Auke; +Cc: linux-kernel
Thanks for responding. I am very interested to find the source of this
problem.
Kok, Auke wrote:
> William Montgomery wrote:
>
>> In an earlier post to the list I described a hard lockup condition
>> that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
>> a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
>> and occurs on 2 out of 3 computers.
>>
>> Further testing has revealed that the lockup can be prevented on all
>> computers by making sure the card is installed on the primary PCI bus.
>> If the card is installed in a slot on the secondary PCI bus (behind a
>> PCI to PCI bridge) the lockup occurs.
>
>
> sounds like int-A/B/C/D routing issues
The strange thing is that all the ports on the card work fine for a few
minutes, then when some condition (as yet unknown) occurs the system
locks up hard. I am currently using a PCI bus analyzer to capture bus
activity just prior to the lockup to try and find out what leads up to
this condition.
>
>> Are there any PCI tuning registers that I can tweak to get around
>> this problem? Any changes I could make to the e100 driver to fix this?
>
>
> this issue might be resolvable by quirking the bridge chips and
> adjusting any APIC where needed. Unfortunately I don't know much about
> this but it's physically not possible from the e100 driver. The
> special (non-intel) card that has these 4 ports onboard contains a
> bridge chip itself which explains the issues. Even a BIOS issue could
> be the cause here.
I am aware of the bridge chip on the card but not sure what you mean
when you say this explains the issues? I sure would like to figure out
a way around this.
The PCI info follows:
00:00.0 Host bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
Controller/Host-Hub Interface (rev 03)
00:02.0 VGA compatible controller: Intel Corp.
82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 03)
00:1d.0 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801DB (ICH4) USB2 EHCI
Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to
PCI Bridge (rev 82)
00:1f.0 ISA bridge: Intel Corp. 82801DB (ICH4) LPC Bridge (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801DB (ICH4) Ultra ATA 100
Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corp. 82801DB/DBM (ICH4) SMBus Controller (rev 02)
01:08.0 Ethernet controller: Intel Corp. 82801BD PRO/100 VE (CNR)
Ethernet Controller (rev 82)
01:0c.0 PCI bridge: Pericom Semiconductor: Unknown device 8150 (rev 02)
02:06.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge
(non-transparent mode) (rev 15)
03:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
03:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
03:06.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
03:07.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
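To make the topology in the listing above easier to see, here is a small sketch that groups `lspci` lines by bus number. It assumes one device per line (the wrapped lines above joined back together) and uses only the bridge and Ethernet entries, with descriptions shortened. `devices_by_bus` is a hypothetical helper, not part of any existing tool.

```python
import re
from collections import defaultdict

# Subset of the listing above, one device per line, descriptions shortened.
SAMPLE = """\
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge
01:08.0 Ethernet controller: Intel Corp. 82801BD PRO/100 VE (CNR)
01:0c.0 PCI bridge: Pericom Semiconductor: Unknown device 8150
02:06.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge
03:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
03:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
03:06.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
03:07.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
"""

# bus:device.function, then "class: description"
LINE = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7]) ([^:]+): (.*)$")


def devices_by_bus(text):
    """Map bus number -> list of (device, function, class, description)."""
    buses = defaultdict(list)
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            bus, dev, fn, cls, desc = match.groups()
            buses[int(bus, 16)].append((int(dev, 16), int(fn), cls, desc))
    return dict(buses)


if __name__ == "__main__":
    for bus, devs in sorted(devices_by_bus(SAMPLE).items()):
        print(f"bus {bus:02x}:")
        for dev, fn, cls, desc in devs:
            print(f"  {dev:02x}.{fn} {cls}: {desc}")
```

Grouped this way, the four 82557/8/9 ports sit on bus 03, with one PCI bridge listed on each of buses 00, 01 and 02.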
>
> Perhaps the linuxfirmwarekit will reveal more information. In any
> case, fixing this in software would be a gigantic effort.
>
I will look into that on Monday and report what I find. It seems like
it is premature to say how much effort the fix will take since the
problem is not yet known? At least not known to me yet. I would just
like to find out what parameters on the bridge/bridges might affect this
problem and how to modify them.
> Auke
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-13 22:30 ` William Montgomery
@ 2007-07-13 22:41 ` Kok, Auke
2007-07-14 0:54 ` William Montgomery
0 siblings, 1 reply; 14+ messages in thread
From: Kok, Auke @ 2007-07-13 22:41 UTC (permalink / raw)
To: William Montgomery; +Cc: linux-kernel
William Montgomery wrote:
> Thanks for responding. I am very interested to find the source of this
> problem.
>
> Kok, Auke wrote:
>
>> William Montgomery wrote:
>>
>>> In an earlier post to the list I described a hard lockup condition
>>> that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
>>> a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
>>> and occurs on 2 out of 3 computers.
>>>
>>> Further testing has revealed that the lockup can be prevented on all
>>> computers by making sure the card is installed on the primary PCI bus.
>>> If the card is installed in a slot on the secondary PCI bus (behind a
>>> PCI to PCI bridge) the lockup occurs.
>>
>> sounds like int-A/B/C/D routing issues
>
> The strange thing is that all the ports on the card work fine for a few
> minutes, then when some condition (as yet unknown) occurs the system
> locks up hard. I am currently using a PCI bus analyzer to capture bus
> activity just prior to the lockup to try and find out what leads up to
> this condition.
are you running any form of irqbalance, either in-kernel (bad) or the userspace
(better) one?
>>> Are there any PCI tuning registers that I can tweak to get around
>>> this problem? Any changes I could make to the e100 driver to fix this?
>>
>> this issue might be resolvable by quirking the bridge chips and
>> adjusting any APIC where needed. Unfortunately I don't know much about
>> this but it's physically not possible from the e100 driver. The
>> special (non-intel) card that has these 4 ports onboard contains a
>> bridge chip itself which explains the issues. Even a BIOS issue could
>> be the cause here.
>
> I am aware of the bridge chip on the card but not sure what you mean
> when you say this explains the issues? I sure would like to figure out
> a way around this.
irq routing in linux may not be the same as in windows. I have no idea how to
compare them either (dmesg will show the linux setup, but I don't know how to
retrieve this info under windows).
> The PCI info follows:
> 00:00.0 Host bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
> Controller/Host-Hub Interface (rev 03)
> 00:02.0 VGA compatible controller: Intel Corp.
> 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 03)
> 00:1d.0 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #3 (rev 02)
> 00:1d.7 USB Controller: Intel Corp. 82801DB (ICH4) USB2 EHCI
> Controller (rev 02)
> 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to
> PCI Bridge (rev 82)
> 00:1f.0 ISA bridge: Intel Corp. 82801DB (ICH4) LPC Bridge (rev 02)
> 00:1f.1 IDE interface: Intel Corp. 82801DB (ICH4) Ultra ATA 100
> Storage Controller (rev 02)
> 00:1f.3 SMBus: Intel Corp. 82801DB/DBM (ICH4) SMBus Controller (rev 02)
> 01:08.0 Ethernet controller: Intel Corp. 82801BD PRO/100 VE (CNR)
> Ethernet Controller (rev 82)
> 01:0c.0 PCI bridge: Pericom Semiconductor: Unknown device 8150 (rev 02)
> 02:06.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge
> (non-transparent mode) (rev 15)
> 03:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
> (rev 08)
> 03:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
> (rev 08)
> 03:06.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
> (rev 08)
> 03:07.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
> (rev 08)
>
>> Perhaps the linuxfirmwarekit will reveal more information. In any
>> case, fixing this in software would be a gigantic effort.
>>
> I will look into that on Monday and report what I find. It seems like
> it is premature to say how much effort the fix will take since the
> problem is not yet known? At least not known to me yet. I would just
> like to find out what parameters on the bridge/bridges might affect this
> problem and how to modify them.
I personally have no idea and am not knowledgeable enough on this issue, sorry :)
Auke
>
>> Auke
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-13 22:41 ` Kok, Auke
@ 2007-07-14 0:54 ` William Montgomery
0 siblings, 0 replies; 14+ messages in thread
From: William Montgomery @ 2007-07-14 0:54 UTC (permalink / raw)
To: Kok, Auke; +Cc: linux-kernel
Kok, Auke wrote:
> William Montgomery wrote:
>
>> Thanks for responding. I am very interested to find the source of
>> this problem.
>>
>> Kok, Auke wrote:
>>
>>> William Montgomery wrote:
>>>
>>>> In an earlier post to the list I described a hard lockup condition
>>>> that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
>>>> a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
>>>> and occurs on 2 out of 3 computers.
>>>>
>>>> Further testing has revealed that the lockup can be prevented on all
>>>> computers by making sure the card is installed on the primary PCI bus.
>>>> If the card is installed in a slot on the secondary PCI bus (behind a
>>>> PCI to PCI bridge) the lockup occurs.
>>>
>>>
>>> sounds like int-A/B/C/D routing issues
>>
>>
>> The strange thing is that all the ports on the card work fine for a
>> few minutes, then when some condition (as yet unknown) occurs the
>> system locks up hard. I am currently using a PCI bus analyzer to
>> capture bus activity just prior to the lockup to try and find out
>> what leads up to this condition.
>
>
> are you running any form of irqbalance, either in-kernel (bad) or the
> userspace (better) one?
No. This is a Pentium 4 - single core, 2.8GHz.
>
>>>> Are there any PCI tuning registers that I can tweak to get around
>>>> this problem? Any changes I could make to the e100 driver to fix
>>>> this?
>>>
>>>
>>> this issue might be resolvable by quirking the bridge chips and
>>> adjusting any APIC where needed. Unfortunately I don't know much
>>> about this but it's physically not possible from the e100 driver.
>>> The special (non-intel) card that has these 4 ports onboard contains
>>> a bridge chip itself which explains the issues. Even a BIOS issue
>>> could be the cause here.
>>
>>
>> I am aware of the bridge chip on the card but not sure what you mean
>> when you say this explains the issues? I sure would like to figure
>> out a way around this.
>
>
> irq routing in linux may not be the same as in windows. I have no idea
> how to compare them either (dmesg will show the linux setup, but I
> don't know how to retrieve this info under windows).
Not sure how windows applies here; I only use Linux. The main data
point so far is that the card works fine when on the primary PCI bus but
locks up hard after a few minutes when installed in a slot behind a PCI
to PCI bridge. I can provide the dmesg info on Monday.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-13 17:37 e100 PCI bridge problem William Montgomery
2007-07-13 20:36 ` Kok, Auke
@ 2007-07-14 14:43 ` Krzysztof Halasa
2007-07-14 23:17 ` William Montgomery
1 sibling, 1 reply; 14+ messages in thread
From: Krzysztof Halasa @ 2007-07-14 14:43 UTC (permalink / raw)
To: William Montgomery; +Cc: linux-kernel
William Montgomery <william@opinicus.com> writes:
> In an earlier post to the list I described a hard lockup condition
> that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
> a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
> and occurs on 2 out of 3 computers.
>
> Further testing has revealed that the lockup can be prevented on all
> computers by making sure the card is installed on the primary PCI bus.
> If the card is installed in a slot on the secondary PCI bus (behind a
> PCI to PCI bridge) the lockup occurs.
Does the machine #3 have a PCI slot connected to a "secondary" bus?
Have you tried with any other machine with a secondary bus?
> Are there any PCI tuning registers that I can tweak to get around
> this problem? Any changes I could make to the e100 driver to fix this?
Could be a hardware/BIOS problem on machines #1 and #2. Could be
a Linux bug as well, though similar configurations are known to work
fine. I don't think it has anything to do with IRQs.
Perhaps it doesn't like a bridge (on the card) behind a bridge
(on the motherboard). I would test with another multiport card
such as old DLink DFE-570TX (using a DEC 21150 bridge and four
21143 Ethernet chips).
I'd probably use some PCI analyzer or, at least, I'd check
the bus state with a multimeter.
--
Krzysztof Halasa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-14 14:43 ` Krzysztof Halasa
@ 2007-07-14 23:17 ` William Montgomery
2007-07-14 23:49 ` Krzysztof Halasa
0 siblings, 1 reply; 14+ messages in thread
From: William Montgomery @ 2007-07-14 23:17 UTC (permalink / raw)
To: Krzysztof Halasa; +Cc: linux-kernel
Krzysztof Halasa wrote:
>William Montgomery <william@opinicus.com> writes:
>
>
>
>>In an earlier post to the list I described a hard lockup condition
>>that occurs on linux kernels 2.4.22, 2.6.13, and 2.6.17 when using
>>a 4 port 10/100 fast ethernet card. The lockup is easily repeatable
>>and occurs on 2 out of 3 computers.
>>
>>Further testing has revealed that the lockup can be prevented on all
>>computers by making sure the card is installed on the primary PCI bus.
>>If the card is installed in a slot on the secondary PCI bus (behind a
>>PCI to PCI bridge) the lockup occurs.
>>
>>
>
>Does the machine #3 have a PCI slot connected to a "secondary" bus?
>Have you tried with any other machine with a secondary bus?
>
>
>
The #3 machine doesn't have a secondary bus. #1 and #2 are from 2
different vendors (#1 Advantech - #2 Axiomtek) and I haven't tried any
other machines.
>>Are there any PCI tuning registers that I can tweak to get around
>>this problem? Any changes I could make to the e100 driver to fix this?
>>
>>
>
>Could be a hardware/BIOS problem on machines #1 and #2. Could be
>a Linux bug as well, though similar configurations are known to work
>fine. I don't think it has anything to do with IRQs.
>
>Perhaps it doesn't like a bridge (on the card) behind a bridge
>(on the motherboard). I would test with another multiport card
>such as old DLink DFE-570TX (using a DEC 21150 bridge and four
>21143 Ethernet chips).
>
>I'd probably use some PCI analyzer or, at least, I'd check
>the bus state with a multimeter.
>
>
The #1 and #2 machines are known to work with an older Adaptec ANA-62044
4-port NIC (tulip based) with an onboard Intel 21154 bridge chip. The
card I am having problems with uses an onboard Hint Corp HB6 Universal
PCI-PCI bridge.
I am using a PCI analyzer and it shows the bus in an idle state after
the lockup. The PCI transactions just prior to the lockup show a couple
of interrupts from the card which appear to be handled correctly.
Anything I should be looking for in particular?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-14 23:17 ` William Montgomery
@ 2007-07-14 23:49 ` Krzysztof Halasa
2007-07-15 1:27 ` William Montgomery
2007-07-17 18:29 ` William Montgomery
0 siblings, 2 replies; 14+ messages in thread
From: Krzysztof Halasa @ 2007-07-14 23:49 UTC (permalink / raw)
To: William Montgomery; +Cc: linux-kernel
William Montgomery <william@opinicus.com> writes:
> I am using a PCI analyzer and it shows the bus in an idle state after
> the lockup. The PCI transactions just prior to the lockup show a
> couple of interrupts from the card which appear to be handled
> correctly. Anything I should be looking for in particular?
I'd try to check with other machine using "secondary" bus slot.
BTW: Are you able to analyze the "primary" bus transactions while
using the card in "secondary" bus? Perhaps there is something
wrong in front of the motherboard bridge?
A broken motherboard may be hard to diagnose, unfortunately.
Can you post something like "lspci -vv" taken on both machines?
--
Krzysztof Halasa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-14 23:49 ` Krzysztof Halasa
@ 2007-07-15 1:27 ` William Montgomery
2007-07-17 18:29 ` William Montgomery
1 sibling, 0 replies; 14+ messages in thread
From: William Montgomery @ 2007-07-15 1:27 UTC (permalink / raw)
To: Krzysztof Halasa; +Cc: linux-kernel
Krzysztof Halasa wrote:
>William Montgomery <william@opinicus.com> writes:
>
>
>
>>I am using a PCI analyzer and it shows the bus in an idle state after
>>the lockup. The PCI transactions just prior to the lockup show a
>>couple of interrupts from the card which appear to be handled
>>correctly. Anything I should be looking for in particular?
>>
>>
>
>I'd try to check with other machine using "secondary" bus slot.
>BTW: Are you able to analyze the "primary" bus transactions while
>using the card in "secondary" bus? Perhaps there is something
>wrong in front of the motherboard bridge?
>
>A broken motherboard may be hard to diagnose, unfortunately.
>
>Can you post something like "lspci -vv" taken on both machines?
>
>
I will post more info on Monday when I am able to power them up.
I'm not so sure the motherboard is broken; I am leaning more towards a
misconfigured bridge. This computer is a 4U 19 inch rackmount chassis
with a PICMG CPU and a 12 slot PCI backplane. I have done a lot of
testing with this box trying to characterize this problem. In one case
I have put 3 Intel PRO 100S NICs on the secondary PCI bus and they ran
under heavy stress test loads overnight. The 4 port NIC seems to be the
only card that doesn't want to cooperate.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-14 23:49 ` Krzysztof Halasa
2007-07-15 1:27 ` William Montgomery
@ 2007-07-17 18:29 ` William Montgomery
2007-07-17 18:55 ` Kok, Auke
2007-07-17 20:54 ` Krzysztof Halasa
1 sibling, 2 replies; 14+ messages in thread
From: William Montgomery @ 2007-07-17 18:29 UTC (permalink / raw)
To: Krzysztof Halasa; +Cc: linux-kernel
Krzysztof Halasa wrote:
>William Montgomery <william@opinicus.com> writes:
>
>
>
>>I am using a PCI analyzer and it shows the bus in an idle state after
>>the lockup. The PCI transactions just prior to the lockup show a
>>couple of interrupts from the card which appear to be handled
>>correctly. Anything I should be looking for in particular?
>>
>>
>
>I'd try to check with other machine using "secondary" bus slot.
>BTW: Are you able to analyze the "primary" bus transactions while
>using the card in "secondary" bus? Perhaps there is something
>wrong in front of the motherboard bridge?
>
>
>
I am able to analyze the primary bus while using the card in the
secondary and I see a very interesting thing on lockup - the primary
side appears to be stuck on a read access to the memory mapped control
regs of the LAN chip (82559) in what appears to be infinite target
retries to the same address. Unfortunately I haven't been able to
capture what occurs just prior to this happening. This is quite
different from what I capture on the secondary side, which is an idle bus.
I have posted the lspci -vv listing below...
>A broken motherboard may be hard to diagnose, unfortunately.
>
>Can you post something like "lspci -vv" taken on both machines?
>
>
Here is the lspci -vv on the machine with lockups (edited for brevity):
00:00.0 Host bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
Controller/Ho
Subsystem: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
Controller/Host
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Step
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort-
Latency: 0
Region 0: Memory at f0000000 (32-bit, prefetchable) [size=64M]
Capabilities: [e4] #09 [1105]
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI
Bridge (rev 82) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR+
Latency: 0
Bus: primary=00, secondary=01, subordinate=03, sec-latency=32
I/O behind bridge: 00009000-0000afff
Memory behind bridge: f4000000-f6ffffff
Prefetchable memory behind bridge: 10000000-103fffff
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
01:0c.0 PCI bridge: Pericom Semiconductor: Unknown device 8150 (rev 02)
(prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32, cache line size 08
Bus: primary=01, secondary=02, subordinate=03, sec-latency=32
I/O behind bridge: 00009000-00009fff
Memory behind bridge: f4000000-f5ffffff
Prefetchable memory behind bridge: 0000000010000000-0000000010300000
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
Capabilities: [dc] Power Management version 1
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] Slot ID: 0 slots, First-, chassis 00
02:06.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge
(non-transparent mode) (rev 15) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32, cache line size 08
Bus: primary=02, secondary=03, subordinate=03, sec-latency=32
I/O behind bridge: 00009000-00009fff
Memory behind bridge: f4000000-f5ffffff
Prefetchable memory behind bridge: 0000000010000000-0000000010300000
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [90] #06 [0080]
Capabilities: [a0] Vital Product Data
03:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
Subsystem: Intel Corp. EtherExpress PRO/100+ Management Adapter
with Alert On LAN*
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 18
Region 0: Memory at f5403000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at 9000 [size=64]
Region 2: Memory at f5000000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 10000000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-
03:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
Subsystem: Intel Corp. EtherExpress PRO/100+ Management Adapter
with Alert On LAN*
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 19
Region 0: Memory at f5401000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at 9400 [size=64]
Region 2: Memory at f5100000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 10100000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-
03:06.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
Subsystem: Intel Corp. EtherExpress PRO/100+ Management Adapter
with Alert On LAN*
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 16
Region 0: Memory at f5400000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at 9800 [size=64]
Region 2: Memory at f5200000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 10200000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-
03:07.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 08)
Subsystem: Intel Corp. EtherExpress PRO/100+ Management Adapter
with Alert On LAN*
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 17
Region 0: Memory at f5402000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at 9c00 [size=64]
Region 2: Memory at f5300000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 10300000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-
======================
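One quick sanity check on a suspected bridge misconfiguration is whether every region of the four LAN controllers falls inside the memory and I/O forwarding windows of each bridge above it. The sketch below does that with the values copied from the listing above. The names `outside` and `check` are hypothetical, and prefetchable windows and expansion ROMs are left out for brevity.

```python
# Forwarding windows (inclusive) copied from the "behind bridge" lines above.
MEM_WINDOWS = {
    "00:1e.0": (0xF4000000, 0xF6FFFFFF),
    "01:0c.0": (0xF4000000, 0xF5FFFFFF),
    "02:06.0": (0xF4000000, 0xF5FFFFFF),
}
IO_WINDOWS = {
    "00:1e.0": (0x9000, 0xAFFF),
    "01:0c.0": (0x9000, 0x9FFF),
    "02:06.0": (0x9000, 0x9FFF),
}

# (kind, base, size) for Region 0, 1 and 2 of each LAN controller.
NIC_REGIONS = {
    "03:04.0": [("mem", 0xF5403000, 0x1000), ("io", 0x9000, 64), ("mem", 0xF5000000, 0x100000)],
    "03:05.0": [("mem", 0xF5401000, 0x1000), ("io", 0x9400, 64), ("mem", 0xF5100000, 0x100000)],
    "03:06.0": [("mem", 0xF5400000, 0x1000), ("io", 0x9800, 64), ("mem", 0xF5200000, 0x100000)],
    "03:07.0": [("mem", 0xF5402000, 0x1000), ("io", 0x9C00, 64), ("mem", 0xF5300000, 0x100000)],
}


def outside(windows, base, size):
    """Return the bridges whose window does not fully contain [base, base+size)."""
    end = base + size - 1
    return [bdf for bdf, (lo, hi) in windows.items() if base < lo or end > hi]


def check():
    """List (nic, kind, base, bridge) for every region a bridge would not forward."""
    problems = []
    for nic, regions in NIC_REGIONS.items():
        for kind, base, size in regions:
            windows = MEM_WINDOWS if kind == "mem" else IO_WINDOWS
            for bridge in outside(windows, base, size):
                problems.append((nic, kind, hex(base), bridge))
    return problems


if __name__ == "__main__":
    issues = check()
    print("all regions inside every bridge window" if not issues else issues)
```

For the values in this listing the check finds no region outside a window, so the address decoding itself looks consistent. That says nothing about other bridge settings such as latency timers, posting, or retry behaviour.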
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-17 18:29 ` William Montgomery
@ 2007-07-17 18:55 ` Kok, Auke
2007-07-17 19:37 ` William Montgomery
2007-07-17 21:04 ` Krzysztof Halasa
2007-07-17 20:54 ` Krzysztof Halasa
1 sibling, 2 replies; 14+ messages in thread
From: Kok, Auke @ 2007-07-17 18:55 UTC (permalink / raw)
To: William Montgomery; +Cc: Krzysztof Halasa, linux-kernel
William Montgomery wrote:
> Krzysztof Halasa wrote:
>
>> William Montgomery <william@opinicus.com> writes:
>>
>>
>>
>>> I am using a PCI analyzer and it shows the bus in an idle state after
>>> the lockup. The PCI transactions just prior to the lockup show a
>>> couple of interrupts from the card which appear to be handled
>>> correctly. Anything I should be looking for in particular?
>>>
>>>
>> I'd try to check with other machine using "secondary" bus slot.
>> BTW: Are you able to analyze the "primary" bus transactions while
>> using the card in "secondary" bus? Perhaps there is something
>> wrong in front of the motherboard bridge?
>>
>>
>>
> I am able to analyze the primary bus while using the card in the
> secondary and I see a very interesting thing on lockup - the primary
> side appears to be stuck on a read access to the memory mapped control
> regs of the LAN chip (82559) in what appears to be infinite target
> retries to the same address. Unfortunately I haven't been able to
> capture what occurs just prior to this happening. This is quite
> different from what I capture on the secondary side, which is an idle bus.
>
> I have posted the lspci -vv listing below...
>
>> A broken motherboard may be hard to diagnose, unfortunately.
>>
>> Can you post something like "lspci -vv" taken on both machines?
>>
>>
> Here is the lspci -vv on the machine with lockups (edited for brevity):
>
> 00:00.0 Host bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
> Controller/Ho
> Subsystem: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
> Controller/Host
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Step
> Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
> <TAbort-
> Latency: 0
> Region 0: Memory at f0000000 (32-bit, prefetchable) [size=64M]
> Capabilities: [e4] #09 [1105]
> 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI
> Bridge (rev 82) (prog-if 00 [Normal decode])
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B-
> Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR+
PERR+ set... not good - this certainly will cause major issues
Auke
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-17 18:55 ` Kok, Auke
@ 2007-07-17 19:37 ` William Montgomery
2007-07-17 21:04 ` Krzysztof Halasa
1 sibling, 0 replies; 14+ messages in thread
From: William Montgomery @ 2007-07-17 19:37 UTC (permalink / raw)
To: Kok, Auke; +Cc: Krzysztof Halasa, linux-kernel
Kok, Auke wrote:
>> 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to
>> PCI Bridge (rev 82) (prog-if 00 [Normal decode])
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR+ FastB2B-
>> Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast
>> >TAbort- <TAbort- <MAbort- >SERR- <PERR+
>
>
> PERR+ set... not good - this certainly will cause major issues
>
I know it sounds that way based on the definition (Detected parity error
on hub side), however I have two other identical systems that have been
running fine for months - with this same bit set - only they use an
Adaptec ANA-64044 (4 port card - 10/100 fast ethernet - unfortunately
discontinued).
It seems the Pericom PCI to PCI bridge is having a problem talking to
the LAN controllers behind the Hint bridge.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-17 18:29 ` William Montgomery
2007-07-17 18:55 ` Kok, Auke
@ 2007-07-17 20:54 ` Krzysztof Halasa
1 sibling, 0 replies; 14+ messages in thread
From: Krzysztof Halasa @ 2007-07-17 20:54 UTC (permalink / raw)
To: William Montgomery; +Cc: linux-kernel
William Montgomery <william@opinicus.com> writes:
> I am able to analyze the primary bus while using the card in the
> secondary and I see a very interesting thing on lockup - the primary
> side appears to be stuck on a read access to the memory mapped control
> regs of the LAN chip (82559) in what appears to be infinite target
> retries to the same address. Unfortunately I haven't been able to
> capture what occurs just prior to this happening. This is quite
> different from what I capture on the secondary side, which is an idle
> bus.
Seems like bridge problem, doesn't it?
I wonder if the infinite retry is the same register every time?
Could it be a deadlock generated by/in the bridge? I'd look at
the bridge specs and maybe updates, perhaps they have some hints.
> 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to
> PCI Bridge (rev 82) (prog-if 00 [Normal decode])
> Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR+
^^^^^^
> Bus: primary=00, secondary=01, subordinate=03, sec-latency=32
I wonder why PERR is set and which device on bus #0 or #1 causes it?
> 01:0c.0 PCI bridge: Pericom Semiconductor: Unknown device 8150 (rev
> 02) (prog-if 00 [Normal decode])
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B-
> Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> Bus: primary=01, secondary=02, subordinate=03, sec-latency=32
It seems PERR on bus #0 isn't generated by this bridge, at least
it doesn't signal that in its status. Who knows, it may be unrelated.
Have you tried to perform the same tests on the other machine?
--
Krzysztof Halasa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: e100 PCI bridge problem
2007-07-17 18:55 ` Kok, Auke
2007-07-17 19:37 ` William Montgomery
@ 2007-07-17 21:04 ` Krzysztof Halasa
1 sibling, 0 replies; 14+ messages in thread
From: Krzysztof Halasa @ 2007-07-17 21:04 UTC (permalink / raw)
To: Kok, Auke; +Cc: William Montgomery, linux-kernel
"Kok, Auke" <auke-jan.h.kok@intel.com> writes:
> PERR+ set... not good - this certainly will cause major issues
Unfortunately some devices assert PERR without a good reason,
and it may do no special harm.
Should be handled and cleared, probably.
--
Krzysztof Halasa
^ permalink raw reply [flat|nested] 14+ messages in thread
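For readers following the PERR discussion: the flags on an `lspci` "Status:" line correspond to bits of the 16-bit PCI status register at configuration offset 0x06, and the error bits are cleared by writing a 1 to them. The sketch below decodes such a value and builds the mask that would clear a given flag. The helper names `decode_status` and `clear_mask` are hypothetical, and the value 0x8080 is reconstructed from the flags shown for 00:1e.0 rather than read from hardware.

```python
# Bit positions in the PCI status register, labelled the way lspci prints them.
STATUS_BITS = {
    4: "Cap",        # capabilities list present
    5: "66MHz",      # 66 MHz capable
    6: "UDF",
    7: "FastB2B",    # fast back-to-back capable
    8: "ParErr",     # master data parity error
    11: ">TAbort",   # signaled target abort
    12: "<TAbort",   # received target abort
    13: "<MAbort",   # received master abort
    14: ">SERR",     # signaled system error
    15: "<PERR",     # detected parity error
}
DEVSEL = {0: "fast", 1: "medium", 2: "slow"}

# Error bits that are cleared by writing 1 to them.
WRITE_ONE_TO_CLEAR = {8, 11, 12, 13, 14, 15}


def decode_status(value):
    """Return {flag: bool} plus the DEVSEL timing for a 16-bit status value."""
    flags = {name: bool((value >> bit) & 1) for bit, name in STATUS_BITS.items()}
    flags["DEVSEL"] = DEVSEL.get((value >> 9) & 0x3, "reserved")
    return flags


def clear_mask(names):
    """Build the value to write to the status register to clear the given flags."""
    by_name = {name: bit for bit, name in STATUS_BITS.items()}
    mask = 0
    for name in names:
        bit = by_name[name]
        if bit not in WRITE_ONE_TO_CLEAR:
            raise ValueError(f"{name} is not a write-1-to-clear bit")
        mask |= 1 << bit
    return mask


if __name__ == "__main__":
    # 00:1e.0 above: FastB2B+ and <PERR+ set, DEVSEL=fast, everything else clear.
    status = 0x8080
    print(decode_status(status))
    print(hex(clear_mask(["<PERR"])))
```

Clearing the bit and then watching whether it becomes set again during the stress test would show whether the parity error is a one-time leftover or something that keeps happening.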
end of thread, other threads:[~2007-07-17 21:04 UTC | newest]
2007-07-13 17:37 e100 PCI bridge problem William Montgomery
2007-07-13 20:36 ` Kok, Auke
2007-07-13 22:30 ` William Montgomery
2007-07-13 22:41 ` Kok, Auke
2007-07-14 0:54 ` William Montgomery
2007-07-14 14:43 ` Krzysztof Halasa
2007-07-14 23:17 ` William Montgomery
2007-07-14 23:49 ` Krzysztof Halasa
2007-07-15 1:27 ` William Montgomery
2007-07-17 18:29 ` William Montgomery
2007-07-17 18:55 ` Kok, Auke
2007-07-17 19:37 ` William Montgomery
2007-07-17 21:04 ` Krzysztof Halasa
2007-07-17 20:54 ` Krzysztof Halasa