* sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-03 5:10 UTC
To: linux-ide; +Cc: lkml, sysadmin

The error in the subject appears in the console, immediately followed by
a hard freeze of the machine. The error occurs reproducibly on two
identical Opteron servers, each one equipped with two identical
controller cards:

 03:04.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
 03:06.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)

We can trigger the problem within a few seconds by starting a
reconstruction on a drive hooked to port 4 (counting from 0) of the
second controller. Oddly, every other drive works reliably and the
faulty drive works if we connect it to, for example, port 4 of the first
controller.

I'd like to stress that the problem occurs systematically, on two
completely distinct machines. We swapped drives, cables and controllers
to exclude other possibilities.

Tested with Debian kernels 2.6.26-19 and 2.6.30-8. Let me know if
further details are needed.

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Mark Lord @ 2009-10-05 21:45 UTC
To: Bernie Innocenti; +Cc: linux-ide, lkml, sysadmin

Bernie Innocenti wrote:
> The error in the subject appears in the console, immediately followed by
> a hard freeze of the machine. The error occurs reproducibly on two
> identical Opteron servers, each one equipped with two identical
> controller cards:
>
>  03:04.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
>  03:06.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
>
> We can trigger the problem within a few seconds by starting a
> reconstruction on a drive hooked to port 4 (counting from 0) of the
> second controller. Oddly, every other drive works reliably and the
> faulty drive works if we connect it to, for example, port 4 of the first
> controller.
>
> I'd like to stress that the problem occurs systematically, on two
> completely distinct machines. We swapped drives, cables and controllers
> to exclude other possibilities.
>
> Tested with Debian kernels 2.6.26-19 and 2.6.30-8. Let me know if
> further details are needed.
..
> 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040..
..

0x30000040 here means "MRdPerr":
"bad data parity detected during PCI master read".

Which means that a data parity error happened
during an outgoing data transfer on the PCI-X bus.
This could happen due to noise on the bus,
dying capacitors, or (?) bad RAM (not sure about the last one).

The expected behaviour here is for sata_mv to then
perform a full SATA reset, after which the I/O will be reattempted.

But it appears to lock up before that happens.
The code does try to clear the PCI error interrupt,
but perhaps it needs clearing in more than the one register
where it currently does so.

Looking over the code and the documentation I have (NDA),
nothing obvious springs to view. There are some extra registers
we could be dumping out, to show exactly what PCI phase and address
caused the error, but reading those won't cause or prevent a lockup.

Best bet would be to try replacing the RAM in that box,
and see if the problem goes away.

Cheers
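For readers who want to see which bits are actually set in the reported cause word, here is a minimal sketch in C. It only splits the value into bit positions; it makes no attempt to name the bits, because the register layout comes from documentation that is under NDA. The function name `set_bit_positions` is hypothetical and not part of sata_mv.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store the positions of the set bits of `value` in `out`, lowest bit
 * first, and return how many were found. `out` needs room for 32 entries.
 * Hypothetical helper for illustration; not taken from the sata_mv driver.
 */
size_t set_bit_positions(uint32_t value, unsigned int out[32])
{
    size_t count = 0;

    for (unsigned int bit = 0; bit < 32; bit++) {
        if (value & (UINT32_C(1) << bit))
            out[count++] = bit;
    }
    return count;
}
```

Called with 0x30000040, this reports bits 6, 28 and 29. Which of them corresponds to "MRdPerr" is not stated in the thread, so that mapping is left open here.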
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-06 4:16 UTC
To: Mark Lord; +Cc: linux-ide, lkml, sysadmin

On Mon, 05-10-2009 at 17:45 -0400, Mark Lord wrote:
> 0x30000040 here means "MRdPerr":
> "bad data parity detected during PCI master read".
>
> Which means that a data parity error happened
> during an outgoing data transfer on the PCI-X bus.
> This could happen due to noise on the bus,
> dying capacitors, or (?) bad RAM (not sure about the last one).

Oddly, we see this on two different machines. And only on specific ports
of the second controller card.

On one of these machines, we've also found a bunch of MCEs related to
ECC errors, but we were unable to reproduce them by exercising the CPU
and the bus with tools like cpuburn or md5sum of entire drives.

The other one has been running for 2 days with no errors whatsoever.
Both have successfully completed a 24h cycle of memtest86+.

> The expected behaviour here is for sata_mv to then
> perform a full SATA reset, after which the I/O will be reattempted.
>
> But it appears to lock up before that happens.
> The code does try to clear the PCI error interrupt,
> but perhaps it needs clearing in more than the one register
> where it currently does so.

I got a few of these recoverable errors overnight (perhaps along with
the MCE errors I described above). The bus was reset as you describe.

The PCI errors seem to cause a system freeze only during RAID
reconstruction. Perhaps the bus reset logic is not sufficiently locked
against re-entrance?

> Looking over the code and the documentation I have (NDA),
> nothing obvious springs to view. There are some extra registers
> we could be dumping out, to show exactly what PCI phase and address
> caused the error, but reading those won't cause or prevent a lockup.
>
> Best bet would be to try replacing the RAM in that box,
> and see if the problem goes away.

We'll try this tomorrow, thank you very much for providing these clues.

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Harri Olin @ 2009-10-06 12:25 UTC
To: Mark Lord; +Cc: Bernie Innocenti, linux-ide, lkml, sysadmin

Mark Lord wrote:
> Bernie Innocenti wrote:
>> The error in the subject appears in the console, immediately followed by
>> a hard freeze of the machine. The error occurs reproducibly on two
>> identical Opteron servers, each one equipped with two identical
>> controller cards:
>>
>>  03:04.0 SCSI storage controller: Marvell Technology Group Ltd.
>> MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
>>  03:06.0 SCSI storage controller: Marvell Technology Group Ltd.
>> MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
>>
>> We can trigger the problem within a few seconds by starting a
>> reconstruction on a drive hooked to port 4 (counting from 0) of the
>> second controller. Oddly, every other drive works reliably and the
>> faulty drive works if we connect it to, for example, port 4 of the first
>> controller.
>>
>> Tested with Debian kernels 2.6.26-19 and 2.6.30-8. Let me know if
>> further details are needed.
> ..
>> 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040..
> ..
>
> 0x30000040 here means "MRdPerr":
> "bad data parity detected during PCI master read".
>
> Which means that a data parity error happened
> during an outgoing data transfer on the PCI-X bus.
> This could happen due to noise on the bus,
> dying capacitors, or (?) bad RAM (not sure about the last one).
>

I have heard of the same thing happening with the same kind of
configuration, using a Supermicro H8DME-2 motherboard and an Opteron 2378
CPU.

Even the controllers were in the same slots.

My initial suspicion was that the motherboard does not drop the PCI-X
bus frequency to 100MHz and drives the bus at 133MHz even though there
are 2 controllers connected. The proposed fix was to move the other
controller to the other bus, as the H8DME-2 has four PCI-X slots,
2x100MHz and 2x133MHz, but I haven't yet heard back whether it helped.

Even the kernel was the same - the latest Debian distribution kernel. It
might be worthwhile to try a vanilla kernel.org kernel if possible.

At home I have two 6081 controllers on the same bus, but at 100MHz, and
no problems yet.

-- 
Harri.
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-06 18:04 UTC
To: Harri Olin; +Cc: Mark Lord, linux-ide, lkml, sysadmin

On Tue, 06-10-2009 at 15:25 +0300, Harri Olin wrote:
> Mark Lord wrote:
> > Bernie Innocenti wrote:
> >> The error in the subject appears in the console, immediately followed by
> >> a hard freeze of the machine. The error occurs reproducibly on two
> >> identical Opteron servers, each one equipped with two identical
> >> controller cards:
> >>
> >>  03:04.0 SCSI storage controller: Marvell Technology Group Ltd.
> >> MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
> >>  03:06.0 SCSI storage controller: Marvell Technology Group Ltd.
> >> MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
> >>
> >> We can trigger the problem within a few seconds by starting a
> >> reconstruction on a drive hooked to port 4 (counting from 0) of the
> >> second controller. Oddly, every other drive works reliably and the
> >> faulty drive works if we connect it to, for example, port 4 of the first
> >> controller.
> >>
> >> Tested with Debian kernels 2.6.26-19 and 2.6.30-8. Let me know if
> >> further details are needed.
> > ..
> >> 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040..
> > ..
> >
> > 0x30000040 here means "MRdPerr":
> > "bad data parity detected during PCI master read".
> >
> > Which means that a data parity error happened
> > during an outgoing data transfer on the PCI-X bus.
> > This could happen due to noise on the bus,
> > dying capacitors, or (?) bad RAM (not sure about the last one).
> >
> I have heard of the same thing happening with the same kind of
> configuration, using a Supermicro H8DME-2 motherboard and an Opteron 2378
> CPU.
>
> Even the controllers were in the same slots.

Close. Mine is a Supermicro H8DM8-2 with 2x Opteron 2374 HE CPU.

> My initial suspicion was that the motherboard does not drop the PCI-X
> bus frequency to 100MHz and drives the bus at 133MHz even though there
> are 2 controllers connected. The proposed fix was to move the other
> controller to the other bus, as the H8DME-2 has four PCI-X slots,
> 2x100MHz and 2x133MHz, but I haven't yet heard back whether it helped.

Thanks for this hint, I'll try this tomorrow.

> Even the kernel was the same - the latest Debian distribution kernel. It
> might be worthwhile to try a vanilla kernel.org kernel if possible.

As a matter of fact, yesterday I tried booting off an Open Solaris
Nexenta CD and I couldn't reproduce the issue, although I couldn't
reproduce the exact same conditions that trigger the bug systematically
on Linux.

> At home I have two 6081 controllers on the same bus, but at 100MHz, and
> no problems yet.

Is there a way to find out what the current PCI-X bus frequency is from
Linux? And from the BIOS?

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Mark Lord @ 2009-10-06 20:06 UTC
To: Bernie Innocenti; +Cc: Harri Olin, linux-ide, lkml, sysadmin

If you could also send me the output of "lspci -vv" for the cards
then I can also have a quick look at chipset errata for possibilities.

The early revs of these chips did have a number of errata specific to PCI-X.

Cheers
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-07 0:06 UTC
To: Mark Lord; +Cc: Harri Olin, linux-ide, lkml, sysadmin

On Tue, 06-10-2009 at 16:06 -0400, Mark Lord wrote:
> If you could also send me the output of "lspci -vv" for the cards
> then I can also have a quick look at chipset errata for possibilities.

See below. Looking at the Status field, is it correct to say that the
cards are definitely running at 133MHz? Is there a way to force them to
a different speed from Linux or from the BIOS?

> The early revs of these chips did have a number of errata specific to PCI-X.

I checked the revision (09) against the sata_mv source and I couldn't
spot anything relevant to us.

03:04.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
        Subsystem: Marvell Technology Group Ltd. Device 11ab
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 19
        Region 0: Memory at feb00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: I/O ports at e800 [size=256]
        Region 3: [virtual] Memory at fdc00000 (32-bit, non-prefetchable) [size=4M]
        [virtual] Expansion ROM at fd800000 [disabled] [size=4M]
        Capabilities: [40] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [60] PCI-X non-bridge device
                Command: DPERE- ERO- RBC=512 OST=4
                Status: Dev=03:04.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
        Kernel driver in use: sata_mv
        Kernel modules: sata_mv

03:06.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
        Subsystem: Marvell Technology Group Ltd. Device 11ab
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 18
        Region 0: Memory at fea00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: I/O ports at e400 [size=256]
        Region 3: [virtual] Memory at fd400000 (32-bit, non-prefetchable) [size=4M]
        [virtual] Expansion ROM at fd000000 [disabled] [size=4M]
        Capabilities: [40] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [60] PCI-X non-bridge device
                Command: DPERE- ERO- RBC=512 OST=4
                Status: Dev=03:06.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
        Kernel driver in use: sata_mv
        Kernel modules: sata_mv

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
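The question about the Status field can be checked mechanically. Below is a small sketch in C that looks up a "+"/"-" flag in one line of `lspci -vv` output such as the PCI-X Status line above. The helper name `lspci_flag` is hypothetical. One caveat, stated as my understanding and not as something confirmed in this thread: the `133MHz+` flag of a PCI-X non-bridge device reports that the device is capable of 133MHz operation, so it does not by itself prove the clock the bus is actually running at.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up a "+"/"-" flag such as "133MHz" in one line of `lspci -vv` output.
 * Returns 1 for "name+", 0 for "name-", and -1 if the flag is not present.
 * Hypothetical helper for illustration; lspci itself offers no such API.
 */
int lspci_flag(const char *line, const char *name)
{
    size_t len = strlen(name);
    const char *p = line;

    if (len == 0)
        return -1;

    while ((p = strstr(p, name)) != NULL) {
        /* Accept only whole tokens, so "66MHz" does not match "266MHz". */
        int at_token_start = (p == line) || (p[-1] == ' ') || (p[-1] == '\t');

        if (at_token_start && p[len] == '+')
            return 1;
        if (at_token_start && p[len] == '-')
            return 0;
        p++;
    }
    return -1;
}
```

For the Status line of 03:06.0 quoted above, `lspci_flag(line, "133MHz")` returns 1 and `lspci_flag(line, "266MHz")` returns 0.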
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-07 1:40 UTC
To: Mark Lord; +Cc: Harri Olin, linux-ide, lkml, sysadmin

On Tue, 06-10-2009 at 20:06 -0400, Bernie Innocenti wrote:
> > The early revs of these chips did have a number of errata specific to PCI-X.
>
> I checked the revision (09) against the sata_mv source and I couldn't
> spot anything relevant to us.

NEWSFLASH: today we replaced the 4x500GB Seagate drives with 4x1.5TB
drives and reconstruction of the array has been running for 2h without a
glitch.

One interesting difference is that the 500GB drives were being
configured in 1.5Gbps SATA mode. Another notable difference is the
sequential read speed: ~70MB/s vs ~130MB/s with the 1.5TB model.

Could the PCI bus errors be a red herring?

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Mark Lord @ 2009-10-07 3:13 UTC
To: Bernie Innocenti; +Cc: Harri Olin, linux-ide, lkml, sysadmin

Bernie Innocenti wrote:
> On Tue, 06-10-2009 at 20:06 -0400, Bernie Innocenti wrote:
>>> The early revs of these chips did have a number of errata specific to PCI-X.
>> I checked the revision (09) against the sata_mv source and I couldn't
>> spot anything relevant to us.
>
> NEWSFLASH: today we replaced the 4x500GB Seagate drives with 4x1.5TB
> drives and reconstruction of the array has been running for 2h without a
> glitch.
>
> One interesting difference is that the 500GB drives were being
> configured in 1.5Gbps SATA mode. Another notable difference is the
> sequential read speed: ~70MB/s vs ~130MB/s with the 1.5TB model.
>
> Could the PCI bus errors be a red herring?
..

Dunno. Rev.9 == "C0" in Marvell terminology,
and that's the latest/final rev for the 6081 chip,
with most of the PCI-X bugs fixed or worked around.
So not much to go on there.

The Bus error report was real, though.
But with 3.0gb/sec sata connections, the chip will be
using some different internal clocks and timings,
which could be enough to avoid triggering the PCI errors.

I guess. Let's hope so, anyway.

Cheers
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-08 16:42 UTC
To: Mark Lord; +Cc: Harri Olin, linux-ide, lkml, sysadmin

On Tue, 06-10-2009 at 23:13 -0400, Mark Lord wrote:
> Dunno. Rev.9 == "C0" in Marvell terminology,
> and that's the latest/final rev for the 6081 chip,
> with most of the PCI-X bugs fixed or worked around.
> So not much to go on there.
>
> The Bus error report was real, though.
> But with 3.0gb/sec sata connections, the chip will be
> using some different internal clocks and timings,
> which could be enough to avoid triggering the PCI errors.
>
> I guess. Let's hope so, anyway.

Our prayers have not been answered :-(

I tried several things:

 - Forcing all the 500GB Seagate drives to 3.0Gbps does not help

 - Replacing the 500GB drives with 1.5TB drives seems to make the
   PCI error much less frequent

 - Moving the controllers to different slots (on different buses)
   does not help

 - Happens with both 2.6.26 (from lenny) and 2.6.30 (from sid)

 - Unplugging one of the controllers appeared to lead to a stable
   configuration, but yesterday I left the machines reconstructing
   the arrays and this morning one of them is not answering pings ;-(

I want to try reducing the frequency of the PCI-X bus, but the BIOS does
not seem to provide a setting for it. Is there another way?

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Tony Vroon @ 2009-10-08 17:09 UTC
To: Bernie Innocenti; +Cc: Mark Lord, Harri Olin, linux-ide, lkml, sysadmin

On Thu, 2009-10-08 at 12:42 -0400, Bernie Innocenti wrote:
> On Tue, 06-10-2009 at 23:13 -0400, Mark Lord wrote:
> I want to try reducing the frequency of the PCI-X bus, but the BIOS does
> not seem to provide a setting for it. Is there another way?

Generally this is done with a physical jumper on the board instead.
You'll find it near the bridge chip, which is almost always by NEC.
Another technique to slow the bridge down is to insert a regular PCI
card in the other slot (these bridges tend to offer 2 or 3 slots). As
the weakest link, it'll drag everything down to 33MHz.
An old PCI-X 66MHz-only card may prove helpful here as well. You don't
have to drive it in any way; getting power to it is sufficient.

Regards,
Tony V.
* [SOLVED] Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-14 15:24 UTC
To: Tony Vroon; +Cc: Mark Lord, Harri Olin, linux-ide, lkml, sysadmin

On Thu, 08-10-2009 at 18:09 +0100, Tony Vroon wrote:
> On Thu, 2009-10-08 at 12:42 -0400, Bernie Innocenti wrote:
> > On Tue, 06-10-2009 at 23:13 -0400, Mark Lord wrote:
> > I want to try reducing the frequency of the PCI-X bus, but the BIOS does
> > not seem to provide a setting for it. Is there another way?
>
> Generally this is done with a physical jumper on the board instead.
> You'll find it near the bridge chip, which is almost always by NEC.
> Another technique to slow the bridge down is to insert a regular PCI
> card in the other slot (these bridges tend to offer 2 or 3 slots). As
> the weakest link, it'll drag everything down to 33MHz.
> An old PCI-X 66MHz-only card may prove helpful here as well. You don't
> have to drive it in any way; getting power to it is sufficient.

Hurray! It seems we've fixed our stability issue at last.

We forced the bus speed down to PCI-X 66MHz for both buses by shorting
pins 1-2 of the on-board jumpers.

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
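As a rough check on what the jumper change costs in throughput, the theoretical peak of a parallel PCI/PCI-X bus is its width in bytes times its clock. The sketch below is back-of-the-envelope arithmetic only, under stated assumptions: a 64-bit bus (lspci reports `64bit+` for these cards), nominal clocks rounded to whole MHz, and no allowance for protocol overhead. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Theoretical peak of a parallel PCI/PCI-X bus in MB/s (10^6 bytes/s):
 * bytes per transfer times transfers per microsecond.
 * Illustrative arithmetic only; real throughput is lower.
 */
uint32_t bus_peak_mb_per_s(uint32_t width_bits, uint32_t clock_mhz)
{
    return (width_bits / 8) * clock_mhz;
}

/* Combined sequential rate if `drives` disks all stream at once. */
uint32_t aggregate_drive_mb_per_s(uint32_t drives, uint32_t mb_per_drive)
{
    return drives * mb_per_drive;
}
```

With these assumptions the peak is 1064 MB/s at 133MHz, 800 MB/s at 100MHz and 528 MB/s at 66MHz. Eight drives reading at the ~130MB/s quoted earlier in the thread would add up to 1040 MB/s, so at 66MHz a fully loaded controller could plausibly be limited by the bus and not by the disks. Whether that matters depends on the workload, which the thread does not describe.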
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Mark Lord @ 2009-10-09 3:07 UTC
To: Bernie Innocenti; +Cc: Harri Olin, linux-ide, lkml, sysadmin

Bernie Innocenti wrote:
>
> I want to try reducing the frequency of the PCI-X bus, but the BIOS does
> not seem to provide a setting for it. Is there another way?
..

Nothing that's easy.

Here.. apply this patch, and post the output after you reboot with it.

--- 2.6.31/drivers/ata/sata_mv.c.orig   2009-08-21 22:16:05.000000000 -0400
+++ linux/drivers/ata/sata_mv.c 2009-10-08 23:05:37.392203506 -0400
@@ -3738,6 +3738,12 @@
                        hp_flags |= MV_HP_ERRATA_60X1B2;
                        break;
                case 0x9:
+                       {
+                               struct mv_host_priv *hpriv = host->private_data;
+                               void __iomem *mmio = hpriv->base;
+                               printk(KERN_INFO "sata_mv: pcix_mode=%d\n", mv_in_pcix_mode(host));
+                               printk(KERN_INFO "sata_mv: MV_PCI_COMMAND=%08x\n", readl(mmio + MV_PCI_COMMAND));
+                       }
                        hp_flags |= MV_HP_ERRATA_60X1C0;
                        break;
                default:
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Mark Lord @ 2009-10-09 3:16 UTC
To: Bernie Innocenti; +Cc: Harri Olin, linux-ide, lkml, sysadmin

Mark Lord wrote:
> Bernie Innocenti wrote:
>>
>> I want to try reducing the frequency of the PCI-X bus, but the BIOS does
>> not seem to provide a setting for it. Is there another way?
> ..
>
> Nothing that's easy.
..

Adding to that: there is a register on the chip, which software could
use to override the normal auto-detected PCI mode (bus speed) for the chip.

This could be used to, say, select 100MHz or 66MHz, or even 33MHz operation.

BUT.. the register is autodetected from the bus at power-on,
and so if software wants to override that (by rewriting the reg),
it will also need to reset the PCI bus afterward.

Which requires knowing how to reset a PCI bridge,
something I don't know about.

Cheers
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Bernie Innocenti @ 2009-10-08 16:26 UTC
To: Harri Olin; +Cc: Mark Lord, linux-ide, lkml, sysadmin

On Tue, 06-10-2009 at 14:04 -0400, Bernie Innocenti wrote:
> > I have heard of the same thing happening with the same kind of
> > configuration, using a Supermicro H8DME-2 motherboard and an Opteron 2378
> > CPU.
> >
> > Even the controllers were in the same slots.
>
> Close. Mine is a Supermicro H8DM8-2 with 2x Opteron 2374 HE CPU.

I was wrong (the BIOS DMI block is wrong). The motherboard is labeled
as H8DME-2.

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
* Re: sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040
From: Harri Olin @ 2009-10-08 21:51 UTC
To: Bernie Innocenti; +Cc: Mark Lord, linux-ide, lkml, sysadmin

Bernie Innocenti wrote:
> On Tue, 06-10-2009 at 14:04 -0400, Bernie Innocenti wrote:
>>> I have heard of the same thing happening with the same kind of
>>> configuration, using a Supermicro H8DME-2 motherboard and an Opteron 2378
>>> CPU.
>>>
>>> Even the controllers were in the same slots.
>> Close. Mine is a Supermicro H8DM8-2 with 2x Opteron 2374 HE CPU.
>
> I was wrong (the BIOS DMI block is wrong). The motherboard is labeled
> as H8DME-2.
>

The H8DME-2 is the same board as the H8DM8-2, just without the SCSI
controller.

There are two 3-pin jumpers somewhere between the PCI-X slots, one for
each bus. With these you can force the bus to 66MHz PCI or 66MHz PCI-X.
No jumper means autodetect. Note that this information comes only from
the manual; I haven't been able to confirm what it really does :)

Oh, and in the other identical case, I heard that moving the other
controller to a different bus (1st controller in the top slot and 2nd
controller in the 2nd slot from the bottom) resolved the issue, or at
least it has not produced an error yet.

-- 
Harri.