* FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Linda Walsh @ 2008-11-13 21:21 UTC
To: LKML, Smartmontools Mailing List, linux-ide

FYI -- ever since I switched to using SATA, I've not had a stable kernel.
Sys uptime went from near infinite (striking planned take downs), to less
than a week consistently. I'd been using the Promise 300 TX4 with 1-2
Seagate drives (PDC40718, rev 02).

Finally an explicit problem regarding that controller under Linux, with it
timing out a drive returning from suspend during 'SMART' operations, got a
suggestion from the community (Tnx, Tejun Heo) to try a _cheaper_ but
better featured Silicon Image controller (SiI 3124 SATA).

Not only did it NOT have the SMART problem (that would hang the drive or
machine), but my random hangs seem to have gone away.

My main server has been up nearly 21 days now on 2.6.27-3 SMP
(vanilla-i386).

I'd had problems with kernels ranging back to 2.6.24 or so, when I had
first tried adding SATA to the system.

So Tnx again to Tejun --

and NOTE: the card or driver (or both) for the Promise 300 TX4 isn't
stable for production use -- and has a repeatable problem of timing out
some drives before it can spin-up from standby (just the drive -- not the
computer). The error logically removes the drive from the system until
the next boot (unplugging, and replugging in the SATA cable on the drive
would hang the machine within 5 seconds of replugging in the cable). Not
an instant hang, as might indicate a HW upset from plugging in the cable,
but a couple second delay after plugin -- before keyboard would lock up --
pointing toward the software trying to re-add+initialize the drive.

Needless to say, I'm only using the SiI controller now, and things are
stable.

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Tejun Heo @ 2008-11-16 6:04 UTC
To: Linda Walsh
Cc: LKML, Smartmontools Mailing List, linux-ide, Mikael Pettersson

(cc'ing Mikael Pettersson)

Hello, Linda.

Linda Walsh wrote:
> FYI -- ever since I switched to using SATA, I've not had a stable kernel.
> Sys uptime went from near infinite (striking planned take downs), to less
> than a week consistently. I'd been using the Promise 300 TX4 with 1-2
> Seagate drives (PDC40718, rev 02).
>
> Finally an explicit problem regarding that controller under Linux, with it
> timing out a drive returning from suspend during 'SMART' operations, got a
> suggestion from the community (Tnx, Tejun Heo) to try a _cheaper_ but
> better featured Silicon Image controller (SiI 3124 SATA).

Yeah, I'm quite fond of the controller. Except for the bandwidth limit
due to the limited number of postable requests, which shows up only when
multiple drives are attached to a single port via PMP, I can't think of
anything bad about it.

> Not only did it NOT have the SMART problem (that would hang the drive or
> machine), but my random hangs seem to have gone away.
>
> My main server has been up nearly 21 days now on 2.6.27-3 SMP
> (vanilla-i386).
>
> I'd had problems with kernels ranging back to 2.6.24 or so, when I had
> first tried adding SATA to the system.
>
> So Tnx again to Tejun --
>
> and NOTE: the card or driver (or both) for the Promise 300 TX4 isn't
> stable for production use -- and has a repeatable problem of timing out
> some drives before it can spin-up from standby (just the drive -- not the
> computer). The error logically removes the drive from the system until
> the next boot (unplugging, and replugging in the SATA cable on the drive
> would hang the machine within 5 seconds of replugging in the cable). Not
> an instant hang, as might indicate a HW upset from plugging in the cable,
> but a couple second delay after plugin -- before keyboard would lock up --
> pointing toward the software trying to re-add+initialize the drive.

Some Promise controllers seem to suffer transmission problems when
combined with certain drives, which often show up as timeouts. The
hardreset of sata_promise wasn't as robust as it should have been and
in some cases it wasn't able to recover a link after an error condition,
causing the system to lose the drive after such events. The hardreset
problem was fixed recently by Mikael Pettersson. Can you please try
2.6.28-rc5 and see whether sata_promise still loses drives after
failures?

Mikael, I think the hardreset fix is worth including in -stable.
It should be safe for -stable too, right?

Thanks.

--
tejun

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Mikael Pettersson @ 2008-11-16 11:08 UTC
To: Tejun Heo
Cc: Linda Walsh, LKML, Smartmontools Mailing List, linux-ide, Mikael Pettersson

Tejun Heo writes:
> > and NOTE: the card or driver (or both) for the Promise 300 TX4 isn't
> > stable for production use -- and has a repeatable problem of timing out
> > some drives before it can spin-up from standby (just the drive -- not the
> > computer). The error logically removes the drive from the system until
> > the next boot (unplugging, and replugging in the SATA cable on the drive
> > would hang the machine within 5 seconds of replugging in the cable). Not
> > an instant hang, as might indicate a HW upset from plugging in the cable,
> > but a couple second delay after plugin -- before keyboard would lock up --
> > pointing toward the software trying to re-add+initialize the drive.
>
> Some Promise controllers seem to suffer transmission problems when
> combined with certain drives, which often show up as timeouts. The
> hardreset of sata_promise wasn't as robust as it should have been and
> in some cases it wasn't able to recover a link after an error condition,
> causing the system to lose the drive after such events. The hardreset
> problem was fixed recently by Mikael Pettersson. Can you please try
> 2.6.28-rc5 and see whether sata_promise still loses drives after
> failures?
>
> Mikael, I think the hardreset fix is worth including in -stable.
> It should be safe for -stable too, right?

The hardreset fix was included in 2.6.27.5. I wanted it in 2.6.26-stable
too, but that branch seems to have been closed now.

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Tejun Heo @ 2008-11-16 14:24 UTC
To: Mikael Pettersson
Cc: Linda Walsh, LKML, Smartmontools Mailing List, linux-ide

On Sun, 2008-11-16 at 12:08 +0100, Mikael Pettersson wrote:
> > Mikael, I think the hardreset fix is worth including in -stable.
> > It should be safe for -stable too, right?
>
> The hardreset fix was included in 2.6.27.5. I wanted it in 2.6.26-stable
> too, but that branch seems to have been closed now.

Ah, right, I was looking at 2.6.26.6 and thinking it was 2.6.27.6. :-P

Thanks.

--
tejun

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Brad Campbell @ 2008-11-16 16:48 UTC
To: Mikael Pettersson
Cc: Tejun Heo, Linda Walsh, LKML, Smartmontools Mailing List, linux-ide

Mikael Pettersson wrote:
> Tejun Heo writes:
> > > and NOTE: the card or driver (or both) for the Promise 300 TX4 isn't
> > > stable for production use -- and has a repeatable problem of timing out
> > > some drives before it can spin-up from standby (just the drive -- not the
> > > computer). The error logically removes the drive from the system until
> > > the next boot (unplugging, and replugging in the SATA cable on the drive
> > > would hang the machine within 5 seconds of replugging in the cable). Not
> > > an instant hang, as might indicate a HW upset from plugging in the cable,
> > > but a couple second delay after plugin -- before keyboard would lock up --
> > > pointing toward the software trying to re-add+initialize the drive.
> >
> > Some Promise controllers seem to suffer transmission problems when
> > combined with certain drives, which often show up as timeouts. The
> > hardreset of sata_promise wasn't as robust as it should have been and
> > in some cases it wasn't able to recover a link after an error condition,
> > causing the system to lose the drive after such events. The hardreset
> > problem was fixed recently by Mikael Pettersson. Can you please try
> > 2.6.28-rc5 and see whether sata_promise still loses drives after
> > failures?
> >
> > Mikael, I think the hardreset fix is worth including in -stable.
> > It should be safe for -stable too, right?
>
> The hardreset fix was included in 2.6.27.5. I wanted it in 2.6.26-stable
> too, but that branch seems to have been closed now.

Is that likely to do anything for the old SATA150-TX4?
I have 2 of them in a machine and I've been dropping drives under write
load recently, but it was a 2.6.27.4 kernel.

Reboot required to pick up the drives again (unless the kernel panics
and it reboots itself - which it's been doing also).

Brad
--
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Tejun Heo @ 2008-11-17 2:01 UTC
To: Brad Campbell
Cc: Mikael Pettersson, Linda Walsh, LKML, Smartmontools Mailing List, linux-ide

Brad Campbell wrote:
>> The hardreset fix was included in 2.6.27.5. I wanted it in 2.6.26-stable
>> too, but that branch seems to have been closed now.
>
> Is that likely to do anything for the old SATA150-TX4 ?
> I have 2 of them in a machine and I've been dropping drives under write
> load recently but it was a 2.6.27.4 kernel.
>
> Reboot required to pick up the drives again (unless the kernel panics
> and it reboots itself - which it's been doing also).

Does unloading and reloading sata_promise fix the problem? If your
root is on promise, you'll need to try this from usb stick or live CD.

--
tejun

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
From: Peter Favrholdt @ 2008-11-16 17:34 UTC
To: Mikael Pettersson, linux-ide

Hi Mikael and list,

Mikael Pettersson wrote:
> The hardreset fix was included in 2.6.27.5. I wanted it in
> 2.6.26-stable too, but that branch seems to have been closed now.

Having read about the hardreset fix I have just tried 2.6.27.5 on my
setup (which works ok at 1.5Gbps but fails at 3.0Gbps).

I'm still experiencing the same "link is slow to respond" problem using
sata_promise in linux-2.6.27.5 with my

  Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02)

and 4 Seagate 500GB ES drives

  Model: ST3500630NS, Firmware: 3.AEE
  (with 1.5/3.0Gbps jumper removed = 3.0Gbps)

After doing:

  dd if=/dev/sda of=/dev/null bs=1M &
  dd if=/dev/sdb of=/dev/null bs=1M &
  dd if=/dev/sdc of=/dev/null bs=1M &
  dd if=/dev/sdd of=/dev/null bs=1M &

it runs fine for a while, then fails in the same way as with earlier
kernels.

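The four parallel dd reads above can also be approximated from Python,
for instance to record which device fails first and after how many
bytes. This is only a minimal sketch and not something used in this
thread: read_to_end and read_all are made-up helper names, the block
size mirrors bs=1M, and the device paths are whatever is passed on the
command line.

```python
import sys
import threading

BLOCK_SIZE = 1024 * 1024  # 1 MiB per read, mirroring dd's bs=1M


def read_to_end(path, results):
    """Read path sequentially until EOF or error; record (bytes_read, error)."""
    total = 0
    try:
        # buffering=0 gives a raw file object, so each read() is one request
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(BLOCK_SIZE)
                if not chunk:
                    break
                total += len(chunk)
        results[path] = (total, None)
    except OSError as exc:
        # An I/O error (e.g. the drive dropping off the bus) ends up here
        results[path] = (total, exc)


def read_all(paths):
    """Start one reader thread per path, like the four backgrounded dd jobs."""
    results = {}
    threads = [threading.Thread(target=read_to_end, args=(p, results))
               for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for path, (nbytes, err) in read_all(sys.argv[1:]).items():
        status = "ok" if err is None else "failed: %s" % err
        print("%s: %d bytes read, %s" % (path, nbytes, status))
```

It would be run as root against the raw devices, e.g. with /dev/sda
through /dev/sdd as arguments. Like plain dd, it only generates
sustained read load; it does not reset or recover the link.
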
Best regards,

Peter

[ 0.438775] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.438961] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 0.439392] 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.439630] 00:0b: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 0.439890] parport_pc 00:0c: reported by Plug and Play ACPI
[ 0.439991] parport0: PC-style at 0x378 (0x778), irq 7, dma 3 [PCSPP,TRISTATE,COMPAT,ECP,DMA]
[ 0.519897] Floppy drive(s): fd0 is 1.44M
[ 0.543252] FDC 0 is a post-1991 82077
[ 0.545415] brd: module loaded
[ 0.545648] ACPI: PCI Interrupt Link [APC2] enabled at IRQ 17
[ 0.545701] skge 0000:01:04.0: PCI INT A -> Link[APC2] -> GSI 17 (level, high) -> IRQ 17
[ 0.545799] skge 1.13 addr 0xe9020000 irq 17 chip Yukon-Lite rev 7
[ 0.546028] skge eth0: addr 00:11:2f:10:bc:bc
[ 0.546133] forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61.
[ 0.546466] ACPI: PCI Interrupt Link [APCH] enabled at IRQ 22
[ 0.546516] forcedeth 0000:00:04.0: PCI INT A -> Link[APCH] -> GSI 22 (level, high) -> IRQ 22
[ 0.546577] forcedeth 0000:00:04.0: setting latency timer to 64
[ 0.546607] nv_probe: set workaround bit for reversed mac addr
[ 0.879571] Switched to high resolution mode on CPU 0
[ 1.060432] forcedeth 0000:00:04.0: ifname eth1, PHY OUI 0x732 @ 1, addr 00:11:2f:10:c1:c8
[ 1.060491] forcedeth 0000:00:04.0: timirq lnktim desc-v1
[ 1.060576] Uniform Multi-Platform E-IDE driver
[ 1.060704] amd74xx 0000:00:09.0: UDMA133 controller
[ 1.060752] amd74xx 0000:00:09.0: IDE controller (0x10de:0x0065 rev 0xa2)
[ 1.060820] amd74xx 0000:00:09.0: BIOS didn't set cable bits correctly. Enabling workaround.
[ 1.060879] amd74xx 0000:00:09.0: not 100% native mode: will probe irqs later
[ 1.060932] ide0: BM-DMA at 0xf000-0xf007
[ 1.060982] ide1: BM-DMA at 0xf008-0xf00f
[ 1.061030] Probing IDE interface ide0...
[ 1.370125] hda: ST380021A, ATA DISK drive
[ 1.670124] hdb: ST380021A, ATA DISK drive
[ 1.730047] hda: host max PIO5 wanted PIO255(auto-tune) selected PIO4
[ 1.730233] hda: UDMA/100 mode selected
[ 1.730351] hdb: host max PIO5 wanted PIO255(auto-tune) selected PIO4
[ 1.730481] hdb: UDMA/100 mode selected
[ 1.730615] Probing IDE interface ide1...
[ 2.570125] hdc: HITACHI DVD-ROM GD-2500, ATAPI CD/DVD-ROM drive
[ 2.930042] hdc: host max PIO5 wanted PIO255(auto-tune) selected PIO4
[ 2.989009] hdc: MWDMA2 mode selected
[ 2.989579] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 2.989679] ide1 at 0x170-0x177,0x376 on irq 15
[ 2.989966] hda: max request size: 128KiB
[ 2.990352] hda: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63
[ 2.990472] hda: cache flushes not supported
[ 2.990553] hda: hda1 hda2 < hda5 hda6 hda7 hda8 >
[ 3.049360] hdb: max request size: 128KiB
[ 3.049857] hdb: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63
[ 3.049977] hdb: cache flushes not supported
[ 3.050053] hdb: hdb1 hdb2 < hdb5 hdb6 hdb7 hdb8 hdb9 >
[ 3.184964] hdc: ATAPI 24X DVD-ROM drive, 512kB Cache
[ 3.185137] Uniform CD-ROM driver Revision: 3.20
[ 3.188154] Driver 'sd' needs updating - please use bus_type methods
[ 3.188229] Driver 'sr' needs updating - please use bus_type methods
[ 3.188347] sata_promise 0000:01:08.0: version 2.12
[ 3.188530] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
[ 3.188582] sata_promise 0000:01:08.0: PCI INT A -> Link[APC3] -> GSI 18 (level, high) -> IRQ 18
[ 3.188822] scsi0 : sata_promise
[ 3.188967] scsi1 : sata_promise
[ 3.189074] scsi2 : sata_promise
[ 3.189183] scsi3 : sata_promise
[ 3.189283] ata1: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024380 irq 18
[ 3.189341] ata2: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024280 irq 18
[ 3.189398] ata3: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024200 irq 18
[ 3.189455] ata4: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024300 irq 18
[ 3.530038] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3.581782] ata1.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133
[ 3.581829] ata1.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
[ 3.673415] ata1.00: configured for UDMA/133
[ 4.020036] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4.069958] ata2.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133
[ 4.070013] ata2.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
[ 4.161596] ata2.00: configured for UDMA/133
[ 4.510036] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4.563919] ata3.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133
[ 4.563967] ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
[ 4.647225] ata3.00: configured for UDMA/133
[ 4.990036] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 5.038870] ata4.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133
[ 5.038918] ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
[ 5.122183] ata4.00: configured for UDMA/133
[ 5.122316] scsi 0:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5
[ 5.122502] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
[ 5.122564] sd 0:0:0:0: [sda] Write Protect is off
[ 5.122610] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 5.122631] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.122745] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
[ 5.122803] sd 0:0:0:0: [sda] Write Protect is off
[ 5.122850] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 5.122869] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.122929] sda: unknown partition table
[ 5.137793] sd 0:0:0:0: [sda] Attached SCSI disk
[ 5.137936] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 5.138035] scsi 1:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5
[ 5.138209] sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[ 5.138268] sd 1:0:0:0: [sdb] Write Protect is off
[ 5.138315] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 5.138334] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.138433] sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[ 5.138492] sd 1:0:0:0: [sdb] Write Protect is off
[ 5.139565] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 5.139585] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.139644] sdb: unknown partition table
[ 5.151121] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 5.151240] sd 1:0:0:0: Attached scsi generic sg1 type 0
[ 5.151334] scsi 2:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5
[ 5.151502] sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
[ 5.151560] sd 2:0:0:0: [sdc] Write Protect is off
[ 5.151606] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 5.151626] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.151725] sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
[ 5.151784] sd 2:0:0:0: [sdc] Write Protect is off
[ 5.151830] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 5.151849] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.151908] sdc: unknown partition table
[ 5.161924] sd 2:0:0:0: [sdc] Attached SCSI disk
[ 5.162040] sd 2:0:0:0: Attached scsi generic sg2 type 0
[ 5.162134] scsi 3:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5
[ 5.162304] sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[ 5.162363] sd 3:0:0:0: [sdd] Write Protect is off
[ 5.162409] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[ 5.162428] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.162526] sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[ 5.162585] sd 3:0:0:0: [sdd] Write Protect is off
[ 5.162631] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[ 5.162651] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.162710] sdd: unknown partition table
[ 5.178706] sd 3:0:0:0: [sdd] Attached SCSI disk
[ 5.178826] sd 3:0:0:0: Attached scsi generic sg3 type 0
[ 5.179254] ACPI: PCI Interrupt Link [APCL] enabled at IRQ 21
[ 5.179309] ehci_hcd 0000:00:02.2: PCI INT C -> Link[APCL] -> GSI 21 (level, high) -> IRQ 21
[ 5.179377] ehci_hcd 0000:00:02.2: setting latency timer to 64
[ 5.179381] ehci_hcd 0000:00:02.2: EHCI Host Controller
[ 5.179489] ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
[ 5.179586] ehci_hcd 0000:00:02.2: debug port 1
[ 5.179635] ehci_hcd 0000:00:02.2: cache line size of 64 is not supported
[ 5.179649] ehci_hcd 0000:00:02.2: irq 21, io mem 0xea083000
[ 5.190005] ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 5.190166] usb usb1: configuration #1 chosen from 1 choice
[ 5.190262] hub 1-0:1.0: USB hub found
[ 5.190317] hub 1-0:1.0: 6 ports detected
[ 5.410239] ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
[ 5.410524] ACPI: PCI Interrupt Link [APCF] enabled at IRQ 20
[ 5.410575] ohci_hcd 0000:00:02.0: PCI INT A -> Link[APCF] -> GSI 20 (level, high) -> IRQ 20
[ 5.410639] ohci_hcd 0000:00:02.0: setting latency timer to 64
[ 5.410642] ohci_hcd 0000:00:02.0: OHCI Host Controller
[ 5.410734] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
[ 5.410805] ohci_hcd 0000:00:02.0: irq 20, io mem 0xea087000
[ 5.472121] usb usb2: configuration #1 chosen from 1 choice
[ 5.472216] hub 2-0:1.0: USB hub found
[ 5.472271] hub 2-0:1.0: 3 ports detected
[ 5.690426] ACPI: PCI Interrupt Link [APCG] enabled at IRQ 22
[ 5.690476] ohci_hcd 0000:00:02.1: PCI INT B -> Link[APCG] -> GSI 22 (level, high) -> IRQ 22
[ 5.690538] ohci_hcd 0000:00:02.1: setting latency timer to 64
[ 5.690541] ohci_hcd 0000:00:02.1: OHCI Host Controller
[ 5.690632] ohci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 3
[ 5.690702] ohci_hcd 0000:00:02.1: irq 22, io mem 0xea082000
[ 5.752085] usb usb3: configuration #1 chosen from 1 choice
[ 5.752176] hub 3-0:1.0: USB hub found
[ 5.752230] hub 3-0:1.0: 3 ports detected
[ 5.830005] usb 2-1: new low speed USB device using ohci_hcd and address 2
[ 5.860305] PNP: No PS/2 controller found. Probing ports directly.
[ 6.020032] usb 2-1: configuration #1 chosen from 1 choice
[ 6.110391] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 6.110568] mice: PS/2 mouse device common for all mice
[ 6.110650] i2c /dev entries driver
[ 6.110859] i2c-adapter i2c-0: nForce2 SMBus adapter at 0x5000
[ 6.110978] i2c-adapter i2c-1: nForce2 SMBus adapter at 0x5500
[ 6.160006] usb 2-2: new low speed USB device using ohci_hcd and address 3
[ 6.348997] usb 2-2: configuration #1 chosen from 1 choice
[ 6.900054] raid6: int32x1 735 MB/s
[ 7.070018] raid6: int32x2 828 MB/s
[ 7.240105] raid6: int32x4 602 MB/s
[ 7.410002] raid6: int32x8 605 MB/s
[ 7.580007] raid6: mmxx1 1525 MB/s
[ 7.750031] raid6: mmxx2 1668 MB/s
[ 7.920043] raid6: sse1x1 1423 MB/s
[ 8.090031] raid6: sse1x2 2186 MB/s
[ 8.090075] raid6: using algorithm sse1x2 (2186 MB/s)
[ 8.090122] md: raid6 personality registered for level 6
[ 8.090169] md: raid5 personality registered for level 5
[ 8.090215] md: raid4 personality registered for level 4
[ 8.090348] device-mapper: ioctl: 4.14.0-ioctl (2008-04-23) initialised: dm-devel@redhat.com
[ 8.090488] TCP cubic registered
[ 8.090543] Using IPI Shortcut mode
[ 8.091079] md: Autodetecting RAID arrays.
[ 8.091125] md: Scanned 0 and added 0 devices.
[ 8.091170] md: autorun ...
[ 8.091213] md: ... autorun DONE.
[ 8.108113] kjournald starting. Commit interval 5 seconds
[ 8.108167] EXT3-fs: mounted filesystem with ordered data mode.
[ 8.108251] VFS: Mounted root (ext3 filesystem) readonly.
[ 8.108546] Freeing unused kernel memory: 208k freed
[ 9.326077] NET: Registered protocol family 1
[ 11.645880] hdc: task_no_data_intr: status=0x51 { DriveReady SeekComplete Error }
[ 11.646070] hdc: task_no_data_intr: error=0x04 { AbortedCommand }
[ 11.646211] ide: failed opcode was: 0xe3
[ 12.171261] usbcore: registered new interface driver hiddev
[ 12.177807] input: Logitech USB Receiver as /class/input/input2
[ 12.198052] input: PC Speaker as /class/input/input3
[ 12.220189] input: USB HID v1.10 Keyboard [Logitech USB Receiver] on usb-0000:00:02.0-1
[ 12.231204] input: Logitech USB Receiver as /class/input/input4
[ 12.273150] input: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-0000:00:02.0-1
[ 12.278735] input: Logitech USB-PS/2 Trackball as /class/input/input5
[ 12.279083] input: USB HID v1.00 Mouse [Logitech USB-PS/2 Trackball] on usb-0000:00:02.0-2
[ 12.279218] usbcore: registered new interface driver usbhid
[ 12.279267] usbhid: v2.6:USB HID core driver
[ 12.433286] ACPI: PCI Interrupt Link [APCM] enabled at IRQ 21
[ 12.433342] ohci1394 0000:00:0d.0: PCI INT A -> Link[APCM] -> GSI 21 (level, high) -> IRQ 21
[ 12.433403] ohci1394 0000:00:0d.0: setting latency timer to 64
[ 12.633606] ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[21] MMIO=[ea084000-ea0847ff] Max Packet=[2048] IR/IT contexts=[4/4]
[ 12.753615] ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 20
[ 12.753672] Intel ICH 0000:00:06.0: PCI INT A -> Link[APCJ] -> GSI 20 (level, high) -> IRQ 20
[ 12.753751] Intel ICH 0000:00:06.0: setting latency timer to 64
[ 13.090016] intel8x0_measure_ac97_clock: measured 55817 usecs
[ 13.090069] intel8x0: clocking to 48000
[ 13.760518] ohci1394: fw-host0: SelfID received outside of bus reset sequence
[ 14.081466] ieee1394: Host added: ID:BUS[0-00:1023] GUID[00e01800007b0179]
[ 14.291280] hdc: task_no_data_intr: status=0x51 { DriveReady SeekComplete Error }
[ 14.291469] hdc: task_no_data_intr: error=0x04 { AbortedCommand }
[ 14.291610] ide: failed opcode was: 0xe3
[ 15.215173] Adding 979924k swap on /dev/hda7. Priority:-1 extents:1 across:979924k
[ 15.479143] EXT3 FS on hda8, internal journal
[ 16.369094] md: md0 stopped.
[ 16.452154] md: bind<sdd>
[ 16.452360] md: bind<sdc>
[ 16.452538] md: bind<sdb>
[ 16.452740] md: bind<sda>
[ 16.461880] raid5: device sda operational as raid disk 0
[ 16.461930] raid5: device sdb operational as raid disk 3
[ 16.461977] raid5: device sdc operational as raid disk 2
[ 16.462024] raid5: device sdd operational as raid disk 1
[ 16.462527] raid5: allocated 4201kB for md0
[ 16.462573] raid5: raid level 5 set md0 active with 4 out of 4 devices, algorithm 2
[ 16.462628] RAID5 conf printout:
[ 16.462671] --- rd:4 wd:4
[ 16.462714] disk 0, o:1, dev:sda
[ 16.462757] disk 1, o:1, dev:sdd
[ 16.462801] disk 2, o:1, dev:sdc
[ 16.462845] disk 3, o:1, dev:sdb
[ 27.908026] ReiserFS: hda6: found reiserfs format "3.6" with standard journal
[ 27.908096] ReiserFS: hda6: using ordered data mode
[ 27.913232] ReiserFS: hda6: journal params: device hda6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[ 27.914339] ReiserFS: hda6: checking transaction log (hda6)
[ 27.963925] ReiserFS: hda6: Using r5 hash to sort names
[ 29.028911] skge eth0: enabling interface
[ 31.373805] skge eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 36.881521] lp0: using parport0 (interrupt-driven).
[ 36.900445] ppdev: user-space parallel port driver
[ 47.206417] warning: `ntpd' uses 32-bit capabilities (legacy support in use)
[ 82.450294] ondemand governor failed, too long transition latency of HW, fallback to performance governor
[ 656.340149] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x6 frozen
[ 656.340159] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns }
[ 656.340168] ata1.00: cmd c8/00:00:18:12:e6/00:00:00:00:00/e1 tag 0 dma 131072 in
[ 656.340169] res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 656.340173] ata1.00: status: { DRDY }
[ 656.340217] ata1: hard resetting link
[ 661.730034] ata1: link is slow to respond, please be patient (ready=-19)
[ 666.350033] ata1: COMRESET failed (errno=-16)
[ 666.350064] ata1: hard resetting link
[ 671.740095] ata1: link is slow to respond, please be patient (ready=-19)
[ 676.361213] ata1: COMRESET failed (errno=-16)
[ 676.361265] ata1: hard resetting link
[ 681.750032] ata1: link is slow to respond, please be patient (ready=-19)
[ 711.390029] ata1: COMRESET failed (errno=-16)
[ 711.390042] ata1: limiting SATA link speed to 1.5 Gbps
[ 711.390088] ata1: hard resetting link
[ 716.430157] ata1: COMRESET failed (errno=-16)
[ 716.430166] ata1: reset failed, giving up
[ 716.430170] ata1.00: disabled
[ 716.430185] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t4
[ 716.430188] ata1: hotplug_status 0x80
[ 716.430245] ata1: hard resetting link
[ 722.220031] ata1: link is slow to respond, please be patient (ready=-19)
[ 726.490037] ata1: COMRESET failed (errno=-16)
[ 726.490095] ata1: hard resetting link
[ 732.290034] ata1: link is slow to respond, please be patient (ready=-19)
[ 736.553291] ata1: COMRESET failed (errno=-16)
[ 736.553329] ata1: hard resetting link
[ 742.340109] ata1: link is slow to respond, please be patient (ready=-19)
[ 771.580032] ata1: COMRESET failed (errno=-16)
[ 771.580043] ata1: limiting SATA link speed to 1.5 Gbps
[ 771.580091] ata1: hard resetting link
[ 776.590064] ata1: COMRESET failed (errno=-16)
[ 776.590072] ata1: reset failed, giving up
[ 776.590086] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t3
[ 776.590088] ata1: hotplug_status 0x80
[ 776.590147] ata1: hard resetting link
[ 782.390031] ata1: link is slow to respond, please be patient (ready=-19)
[ 786.650038] ata1: COMRESET failed (errno=-16)
[ 786.650090] ata1: hard resetting link
[ 792.450034] ata1: link is slow to respond, please be patient (ready=-19)
[ 796.710042] ata1: COMRESET failed (errno=-16)
[ 796.710086] ata1: hard resetting link
[ 802.510034] ata1: link is slow to respond, please be patient (ready=-19)
[ 831.730046] ata1: COMRESET failed (errno=-16)
[ 831.730056] ata1: limiting SATA link speed to 1.5 Gbps
[ 831.730096] ata1: hard resetting link
[ 836.750045] ata1: COMRESET failed (errno=-16)
[ 836.750054] ata1: reset failed, giving up
[ 836.750066] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t2
[ 836.750069] ata1: hotplug_status 0x80
[ 836.750114] ata1: hard resetting link
[ 842.550037] ata1: link is slow to respond, please be patient (ready=-19)
[ 846.810033] ata1: COMRESET failed (errno=-16)
[ 846.810070] ata1: hard resetting link
[ 852.610037] ata1: link is slow to respond, please be patient (ready=-19)
[ 856.871987] ata1: COMRESET failed (errno=-16)
[ 856.872031] ata1: hard resetting link
[ 862.670042] ata1: link is slow to respond, please be patient (ready=-19)
[ 891.890038] ata1: COMRESET failed (errno=-16)
[ 891.890049] ata1: limiting SATA link speed to 1.5 Gbps
[ 891.890098] ata1: hard resetting link
[ 896.900036] ata1: COMRESET failed (errno=-16)
[ 896.900045] ata1: reset failed, giving up
[ 896.900065] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t1
[ 896.900067] ata1: hotplug_status 0x80
[ 896.900126] ata1: hard resetting link
[ 902.690520] ata1: link is slow to respond, please be patient (ready=-19)
[ 906.950040] ata1: COMRESET failed (errno=-16)
[ 906.950097] ata1: hard resetting link
[ 912.750088] ata1: link is slow to respond, please be patient (ready=-19)
[ 917.010039] ata1: COMRESET failed (errno=-16)
[ 917.010089] ata1: hard resetting link
[ 922.810085] ata1: link is slow to respond, please be patient (ready=-19)
[ 952.030039] ata1: COMRESET failed (errno=-16)
[ 952.030051] ata1: limiting SATA link speed to 1.5 Gbps
[ 952.030084] ata1: hard resetting link
[ 957.040071] ata1: COMRESET failed (errno=-16)
[ 957.040080] ata1: reset failed, giving up
[ 957.040087] ata1: EH pending after 5 tries, giving up
[ 957.040102] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
[ 957.040106] sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
[ 957.040110] Descriptor sense data with sense descriptors (in hex):
[ 957.040113] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 957.040119] 00 00 00 09
[ 957.040122] sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
[ 957.040126] end_request: I/O error, dev sda, sector 31855128
[ 957.040130] Buffer I/O error on device sda, logical block 3981891
[ 957.040135] Buffer I/O error on device sda, logical block 3981892
[ 957.040138] Buffer I/O error on device sda, logical block 3981893
[ 957.040142] Buffer I/O error on device sda, logical block 3981894
[ 957.040145] Buffer I/O error on device sda, logical block 3981895
[ 957.040148] Buffer I/O error on device sda, logical block 3981896
[ 957.040151] Buffer I/O error on device sda, logical block 3981897
[ 957.040154] Buffer I/O error on device sda, logical block 3981898
[ 957.040157] Buffer I/O error on device sda, logical block 3981899
[ 957.040160] Buffer I/O error on device sda, logical block 3981900
[ 957.040190] sd 0:0:0:0: rejecting I/O to offline device
[ 957.040196] ata1: EH complete
[ 957.040205] ata1.00: detaching (SCSI 0:0:0:0)
[ 957.040294] sd 0:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00
[ 957.040298] end_request: I/O error, dev sda, sector 31855384
[ 957.040513] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 957.042102] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00
[ 957.042108] sd 0:0:0:0: [sda] Stopping disk
[ 957.042115] sd 0:0:0:0: [sda] START_STOP FAILED
[ 957.042117] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00

* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-16 17:34 ` Peter Favrholdt @ 2008-11-16 17:39 ` Peter Favrholdt 2008-11-17 2:01 ` Tejun Heo 0 siblings, 1 reply; 31+ messages in thread From: Peter Favrholdt @ 2008-11-16 17:39 UTC (permalink / raw) To: Mikael Pettersson, linux-ide Replying to my own post: It was 2.6.27.6 not 2.6.27.5! Peter Favrholdt wrote: > Having read about the hardreset fix I have just tried 2.6.27.5 on my > setup (which works ok at 1.5Gbps but fails at 3.0Gbps). Best regards, Peter ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-16 17:39 ` Peter Favrholdt @ 2008-11-17 2:01 ` Tejun Heo 2008-11-17 11:47 ` Peter Favrholdt 0 siblings, 1 reply; 31+ messages in thread From: Tejun Heo @ 2008-11-17 2:01 UTC (permalink / raw) To: Peter Favrholdt; +Cc: Mikael Pettersson, linux-ide Peter Favrholdt wrote: > Replying to my own post: > > It was 2.6.27.6 not 2.6.27.5! > > Peter Favrholdt wrote: >> Having read about the hardreset fix I have just tried 2.6.27.5 on my >> setup (which works ok at 1.5Gbps but fails at 3.0Gbps). Does unloading and reloading sata_promise fix the problem? -- tejun ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-17 2:01 ` Tejun Heo @ 2008-11-17 11:47 ` Peter Favrholdt 2008-11-18 1:11 ` Tejun Heo 0 siblings, 1 reply; 31+ messages in thread From: Peter Favrholdt @ 2008-11-17 11:47 UTC (permalink / raw) To: Tejun Heo, linux-ide Hi Tejun, Thanks for your reply. Tejun Heo wrote: > Does unloading and reloading sata_promise fix the problem? I had sata_promise linked into the kernel so I had to recompile and reboot. The reboot fixed it (didn't turn power off - just reboot). I then tried with sata_promise as a module. Four dd's reading the drives and shortly same problem: [ 700.520101] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x6 frozen [ 700.520112] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } [ 700.520120] ata1.00: cmd c8/00:00:90:0e:e6/00:00:00:00:00/e1 tag 0 dma 131072 in [ 700.520122] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [ 700.520126] ata1.00: status: { DRDY } [ 700.520169] ata1: hard resetting link [ 705.910331] ata1: link is slow to respond, please be patient (ready=-19) [ 710.531474] ata1: COMRESET failed (errno=-16) [ 710.531531] ata1: hard resetting link [ 715.920526] ata1: link is slow to respond, please be patient (ready=-19) I tried removing sata_promise the nice way using modprobe: # modprobe -r -f sata_promise FATAL: Module sata_promise is in use. then the not-so-nice way using rmmod -f and inserting it again. It doesn't work. 
Here are the relevant parts from dmesg (removing and inserting): [ 1001.080120] Buffer I/O error on device sda, logical block 3981785 [ 1001.080123] Buffer I/O error on device sda, logical block 3981786 [ 1001.080126] Buffer I/O error on device sda, logical block 3981787 [ 1001.080148] sd 0:0:0:0: rejecting I/O to offline device [ 1001.080155] ata1: EH complete [ 1001.080165] ata1.00: detaching (SCSI 0:0:0:0) [ 1001.080268] sd 0:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00 [ 1001.080272] end_request: I/O error, dev sda, sector 31854480 [ 1001.080484] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 1001.084116] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [ 1001.084124] sd 0:0:0:0: [sda] Stopping disk [ 1001.084131] sd 0:0:0:0: [sda] START_STOP FAILED [ 1001.084133] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [ 1857.939501] ata2.00: disabled [ 1857.974792] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [ 1857.974923] sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [ 1857.974928] sd 1:0:0:0: [sdb] Stopping disk [ 1857.974971] sd 1:0:0:0: [sdb] START_STOP FAILED [ 1857.974974] sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [ 1857.975067] ata3.00: disabled [ 1858.021338] sd 2:0:0:0: [sdc] Synchronizing SCSI cache [ 1858.021569] sd 2:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [ 1858.021574] sd 2:0:0:0: [sdc] Stopping disk [ 1858.021744] sd 2:0:0:0: [sdc] START_STOP FAILED [ 1858.021747] sd 2:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [ 1858.021967] ata4.00: disabled [ 1858.075341] sd 3:0:0:0: [sdd] Synchronizing SCSI cache [ 1858.075583] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00 [ 1858.075588] sd 3:0:0:0: [sdd] Stopping disk [ 1858.075760] sd 3:0:0:0: [sdd] START_STOP FAILED [ 1858.075763] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00 [ 1858.075914] sata_promise 0000:01:08.0: PCI INT A disabled [ 1889.910087] sata_promise 0000:01:08.0: version 2.12 [ 1889.910336] sata_promise 0000:01:08.0: 
PCI INT A -> Link[APC3] -> GSI 18 (level, high) -> IRQ 18 [ 1889.910736] scsi4 : sata_promise [ 1889.912425] scsi5 : sata_promise [ 1889.912582] scsi6 : sata_promise [ 1889.913039] scsi7 : sata_promise [ 1889.913107] ata5: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024380 irq 18 [ 1889.913114] ata6: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024280 irq 18 [ 1889.913119] ata7: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024200 irq 18 [ 1889.913123] ata8: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024300 irq 18 [ 1895.300008] ata5: link is slow to respond, please be patient (ready=-19) [ 1899.920484] ata5: COMRESET failed (errno=-16) [ 1905.310009] ata5: link is slow to respond, please be patient (ready=-19) [ 1909.930012] ata5: COMRESET failed (errno=-16) [ 1915.320008] ata5: link is slow to respond, please be patient (ready=-19) [ 1944.960008] ata5: COMRESET failed (errno=-16) [ 1944.960017] ata5: limiting SATA link speed to 1.5 Gbps [ 1949.990008] ata5: COMRESET failed (errno=-16) [ 1949.990015] ata5: reset failed, giving up [ 1950.340041] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 1955.340026] ata6.00: qc timeout (cmd 0xec) [ 1955.340240] ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4) [ 1960.730007] ata6: link is slow to respond, please be patient (ready=-19) [ 1965.350009] ata6: COMRESET failed (errno=-16) [ 1970.740007] ata6: link is slow to respond, please be patient (ready=-19) [ 1975.360010] ata6: COMRESET failed (errno=-16) [ 1980.750007] ata6: link is slow to respond, please be patient (ready=-19) [ 2010.392068] ata6: COMRESET failed (errno=-16) [ 2010.392078] ata6: limiting SATA link speed to 1.5 Gbps [ 2015.420013] ata6: COMRESET failed (errno=-16) [ 2015.420021] ata6: reset failed, giving up [ 2015.420042] ata6: hard resetting link [ 2020.810007] ata6: link is slow to respond, please be patient (ready=-19) [ 2025.430011] ata6: COMRESET failed (errno=-16) [ 2025.430026] ata6: hard resetting link [ 2030.820007] 
ata6: link is slow to respond, please be patient (ready=-19) [ 2035.440011] ata6: COMRESET failed (errno=-16) [ 2035.440025] ata6: hard resetting link [ 2040.830007] ata6: link is slow to respond, please be patient (ready=-19) [ 2070.470007] ata6: COMRESET failed (errno=-16) [ 2070.470023] ata6: hard resetting link [ 2075.500008] ata6: COMRESET failed (errno=-16) [ 2075.500015] ata6: reset failed, giving up [ 2075.500021] ata6: EH complete [ 2075.850041] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 2080.850027] ata7.00: qc timeout (cmd 0xec) [ 2080.850241] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4) [ 2086.240010] ata7: link is slow to respond, please be patient (ready=-19) [ 2090.860010] ata7: COMRESET failed (errno=-16) [ 2096.250008] ata7: link is slow to respond, please be patient (ready=-19) I tried it one more time with the same result. Best regards, Peter ^ permalink raw reply [flat|nested] 31+ messages in thread
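For readers following the log: the `SErr 0x1380000` value in the exception line and the `SError: { 10B8B Dispar BadCRC TrStaTrns }` line that follows it are the same information, once as a raw register value and once as bit names. A minimal sketch of that decoding in Python, assuming the diagnostic bit positions of the SATA SError register as libata names them (the table is written from memory of libata and covers only bits 16-26, so check it against your kernel's `include/linux/libata.h`; `decode_serror` is an illustrative name, not a kernel or tool function):

```python
# Diagnostic (upper-half) SError bits, using the short names libata prints.
# Assumption: bit positions follow the SATA SError register layout; only
# bits 16-26 are listed, the low-order error bits are left out.
SERR_DIAG_BITS = [
    (16, "PHYRdyChg"),
    (17, "PHYInt"),
    (18, "CommWake"),
    (19, "10B8B"),
    (20, "Dispar"),
    (21, "BadCRC"),
    (22, "Handshk"),
    (23, "LinkSeq"),
    (24, "TrStaTrns"),
    (25, "UnrecFIS"),
    (26, "DevExch"),
]


def decode_serror(value):
    """Return the names of the diagnostic bits set in an SError value."""
    return [name for bit, name in SERR_DIAG_BITS if value & (1 << bit)]


# The value reported for ata1 in the log above.
print(decode_serror(0x1380000))
```

All four bits set here (10b-to-8b decode error, disparity error, CRC error, transport state transition error) describe trouble at the link level, which fits the observation that the setup works at 1.5Gbps but fails at 3.0Gbps.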
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-17 11:47 ` Peter Favrholdt @ 2008-11-18 1:11 ` Tejun Heo 2008-11-18 18:03 ` Peter Favrholdt 0 siblings, 1 reply; 31+ messages in thread From: Tejun Heo @ 2008-11-18 1:11 UTC (permalink / raw) To: Peter Favrholdt; +Cc: linux-ide Peter Favrholdt wrote: > Hi Tejun, > > Thanks for your reply. > > Tejun Heo wrote: >> Does unloading and reloading sata_promise fix the problem? > > I had sata_promise linked into the kernel so I had to recompile and > reboot. The reboot fixed it (didn't turn power off - just reboot). > > I then tried with sata_promise as a module. Four dd's reading the drives > and shortly same problem: > > [ 700.520101] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 > action 0x6 frozen > [ 700.520112] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } > [ 700.520120] ata1.00: cmd c8/00:00:90:0e:e6/00:00:00:00:00/e1 tag 0 > dma 131072 in > [ 700.520122] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask > 0x4 (timeout) > [ 700.520126] ata1.00: status: { DRDY } > [ 700.520169] ata1: hard resetting link > [ 705.910331] ata1: link is slow to respond, please be patient (ready=-19) > [ 710.531474] ata1: COMRESET failed (errno=-16) > [ 710.531531] ata1: hard resetting link > [ 715.920526] ata1: link is slow to respond, please be patient (ready=-19) > > I tried removing sata_promise the nice way using modprobe: > > # modprobe -r -f sata_promise > FATAL: Module sata_promise is in use. > > then the not-so-nice way using rmmod -f and inserting it again. It > doesn't work. Here are the relevant parts from dmesg (removing and > inserting): Oh... you first need to wait for the kernel to be done with the devices. ie. unmount all filesystems living on those && wait till EH finishes. -- tejun ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-18 1:11 ` Tejun Heo @ 2008-11-18 18:03 ` Peter Favrholdt 2008-11-19 1:55 ` Tejun Heo 0 siblings, 1 reply; 31+ messages in thread From: Peter Favrholdt @ 2008-11-18 18:03 UTC (permalink / raw) To: linux-ide; +Cc: Tejun Heo Hi Tejun & list Tejun Heo wrote: > Peter Favrholdt wrote: >> I tried removing sata_promise the nice way using modprobe: >> >> # modprobe -r -f sata_promise >> FATAL: Module sata_promise is in use. >> >> then the not-so-nice way using rmmod -f and inserting it again. It >> doesn't work. Here are the relevant parts from dmesg (removing and >> inserting): > > Oh... you first need to wait for the kernel to be done with the devices. > ie. unmount all filesystems living on those && wait till EH finishes. I didn't have any filesystems mounted but learned that the 4 disks participate in a md raid5 set. To be sure I tried everything again from the beginning (reboot) without enabling md on the four drives. Here are the results: 1. started the four dd processes which ran to completion without any errors. Started them again and this time sda failed as usual. 2. modprobe -r sata_promise - no warnings/errors: removed successfully 3. modprobe sata_promise - only three drives show up 4. modprobe -r sata_promise - no warnings/errors: removed successfully 5. 
modprobe sata_promise - more channels complain - no drives are found dmesg: [21626.490086] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x6 frozen [21626.490096] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } [21626.490105] ata1.00: cmd 25/00:00:00:1f:e6/00:02:01:00:00/e0 tag 0 dma 262144 in [21626.490107] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [21626.490110] ata1.00: status: { DRDY } [21626.490162] ata1: hard resetting link [21631.881055] ata1: link is slow to respond, please be patient (ready=-19) [21636.500030] ata1: COMRESET failed (errno=-16) [21636.500068] ata1: hard resetting link [21641.890034] ata1: link is slow to respond, please be patient (ready=-19) [21646.510903] ata1: COMRESET failed (errno=-16) [21646.510919] ata1: hard resetting link [21651.900107] ata1: link is slow to respond, please be patient (ready=-19) [21681.540037] ata1: COMRESET failed (errno=-16) [21681.540047] ata1: limiting SATA link speed to 1.5 Gbps [21681.540074] ata1: hard resetting link [21686.570034] ata1: COMRESET failed (errno=-16) [21686.570043] ata1: reset failed, giving up [21686.570047] ata1.00: disabled [21686.570070] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t4 [21686.570073] ata1: hotplug_status 0x80 [21686.570124] ata1: hard resetting link [21692.360323] ata1: link is slow to respond, please be patient (ready=-19) [21696.620070] ata1: COMRESET failed (errno=-16) [21696.620099] ata1: hard resetting link [21702.410066] ata1: link is slow to respond, please be patient (ready=-19) [21706.670033] ata1: COMRESET failed (errno=-16) [21706.670084] ata1: hard resetting link [21712.460035] ata1: link is slow to respond, please be patient (ready=-19) [21741.680037] ata1: COMRESET failed (errno=-16) [21741.680054] ata1: limiting SATA link speed to 1.5 Gbps [21741.680089] ata1: hard resetting link [21746.690034] ata1: COMRESET failed (errno=-16) [21746.690042] ata1: reset failed, giving up [21746.690055] ata1: exception Emask 
0x10 SAct 0x0 SErr 0x0 action 0xe frozen t3 [21746.690057] ata1: hotplug_status 0x80 [21746.690116] ata1: hard resetting link [21752.480034] ata1: link is slow to respond, please be patient (ready=-19) [21756.740031] ata1: COMRESET failed (errno=-16) [21756.740076] ata1: hard resetting link [21762.530038] ata1: link is slow to respond, please be patient (ready=-19) [21766.792415] ata1: COMRESET failed (errno=-16) [21766.792454] ata1: hard resetting link [21772.580049] ata1: link is slow to respond, please be patient (ready=-19) [21801.800038] ata1: COMRESET failed (errno=-16) [21801.800221] ata1: limiting SATA link speed to 1.5 Gbps [21801.800297] ata1: hard resetting link [21806.810027] ata1: COMRESET failed (errno=-16) [21806.810036] ata1: reset failed, giving up [21806.810050] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t2 [21806.810052] ata1: hotplug_status 0x80 [21806.810110] ata1: hard resetting link [21812.600033] ata1: link is slow to respond, please be patient (ready=-19) [21816.860033] ata1: COMRESET failed (errno=-16) [21816.860085] ata1: hard resetting link [21822.650036] ata1: link is slow to respond, please be patient (ready=-19) [21826.910107] ata1: COMRESET failed (errno=-16) [21826.910158] ata1: hard resetting link [21832.700074] ata1: link is slow to respond, please be patient (ready=-19) [21861.920041] ata1: COMRESET failed (errno=-16) [21861.920058] ata1: limiting SATA link speed to 1.5 Gbps [21861.920106] ata1: hard resetting link [21866.940034] ata1: COMRESET failed (errno=-16) [21866.940043] ata1: reset failed, giving up [21866.940056] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t1 [21866.940059] ata1: hotplug_status 0x80 [21866.940118] ata1: hard resetting link [21872.730031] ata1: link is slow to respond, please be patient (ready=-19) [21877.000085] ata1: COMRESET failed (errno=-16) [21877.000117] ata1: hard resetting link [21882.793307] ata1: link is slow to respond, please be patient (ready=-19) 
[21887.050045] ata1: COMRESET failed (errno=-16) [21887.050088] ata1: hard resetting link [21892.840068] ata1: link is slow to respond, please be patient (ready=-19) [21922.060035] ata1: COMRESET failed (errno=-16) [21922.060046] ata1: limiting SATA link speed to 1.5 Gbps [21922.060073] ata1: hard resetting link [21927.070032] ata1: COMRESET failed (errno=-16) [21927.070040] ata1: reset failed, giving up [21927.070047] ata1: EH pending after 5 tries, giving up [21927.070069] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 [21927.070073] sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor] [21927.070078] Descriptor sense data with sense descriptors (in hex): [21927.070080] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 [21927.070088] 00 00 00 00 [21927.070091] sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0 [21927.070094] end_request: I/O error, dev sda, sector 31858432 [21927.070099] Buffer I/O error on device sda, logical block 3982304 [21927.070104] Buffer I/O error on device sda, logical block 3982305 [21927.070107] Buffer I/O error on device sda, logical block 3982306 [21927.070110] Buffer I/O error on device sda, logical block 3982307 [21927.070114] Buffer I/O error on device sda, logical block 3982308 [21927.070117] Buffer I/O error on device sda, logical block 3982309 [21927.070120] Buffer I/O error on device sda, logical block 3982310 [21927.070124] Buffer I/O error on device sda, logical block 3982311 [21927.070127] Buffer I/O error on device sda, logical block 3982312 [21927.070131] Buffer I/O error on device sda, logical block 3982313 [21927.070177] ata1: EH complete [21927.070186] ata1.00: detaching (SCSI 0:0:0:0) [21927.071140] sd 0:0:0:0: [sda] Synchronizing SCSI cache [21927.072425] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [21927.072431] sd 0:0:0:0: [sda] Stopping disk [21927.072439] sd 0:0:0:0: [sda] START_STOP FAILED [21927.072441] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [24533.844877] ata2.00: disabled [24533.845094] 
sd 1:0:0:0: [sdb] Synchronizing SCSI cache [24533.845129] sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [24533.845133] sd 1:0:0:0: [sdb] Stopping disk [24533.845140] sd 1:0:0:0: [sdb] START_STOP FAILED [24533.845142] sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [24533.845402] ata3.00: disabled [24533.845544] sd 2:0:0:0: [sdc] Synchronizing SCSI cache [24533.845567] sd 2:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [24533.845571] sd 2:0:0:0: [sdc] Stopping disk [24533.845578] sd 2:0:0:0: [sdc] START_STOP FAILED [24533.845580] sd 2:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [24533.845693] ata4.00: disabled [24533.845833] sd 3:0:0:0: [sdd] Synchronizing SCSI cache [24533.845856] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00 [24533.845859] sd 3:0:0:0: [sdd] Stopping disk [24533.845866] sd 3:0:0:0: [sdd] START_STOP FAILED [24533.845868] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00 [24533.846012] sata_promise 0000:01:08.0: PCI INT A disabled [24551.960098] sata_promise 0000:01:08.0: version 2.12 [24551.960353] sata_promise 0000:01:08.0: PCI INT A -> Link[APC3] -> GSI 18 (level, high) -> IRQ 18 [24551.961961] scsi4 : sata_promise [24551.970471] scsi5 : sata_promise [24551.970683] scsi6 : sata_promise [24551.972392] scsi7 : sata_promise [24551.972633] ata5: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024380 irq 18 [24551.972640] ata6: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024280 irq 18 [24551.972645] ata7: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024200 irq 18 [24551.972649] ata8: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024300 irq 18 [24557.360017] ata5: link is slow to respond, please be patient (ready=-19) [24561.980015] ata5: COMRESET failed (errno=-16) [24567.370014] ata5: link is slow to respond, please be patient (ready=-19) [24571.990013] ata5: COMRESET failed (errno=-16) [24577.380015] ata5: link is slow to respond, please be patient (ready=-19) [24607.020010] ata5: COMRESET failed 
(errno=-16) [24607.020019] ata5: limiting SATA link speed to 1.5 Gbps [24612.050010] ata5: COMRESET failed (errno=-16) [24612.050017] ata5: reset failed, giving up [24612.400045] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [24612.459926] ata6.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133 [24612.459930] ata6.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) [24612.551577] ata6.00: configured for UDMA/133 [24612.900042] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [24612.961700] ata7.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133 [24612.961707] ata7.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) [24613.053320] ata7.00: configured for UDMA/133 [24613.400045] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [24613.461067] ata8.00: ATA-7: ST3500630NS, 3.AEE, max UDMA/133 [24613.461072] ata8.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) [24613.552710] ata8.00: configured for UDMA/133 [24613.553014] scsi 5:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5 [24613.559489] sd 5:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) [24613.559589] sd 5:0:0:0: [sda] Write Protect is off [24613.559594] sd 5:0:0:0: [sda] Mode Sense: 00 3a 00 00 [24613.559741] sd 5:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [24613.559901] sd 5:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) [24613.560029] sd 5:0:0:0: [sda] Write Protect is off [24613.560033] sd 5:0:0:0: [sda] Mode Sense: 00 3a 00 00 [24613.560104] sd 5:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [24613.560109] sda: unknown partition table [24613.575844] sd 5:0:0:0: [sda] Attached SCSI disk [24613.576017] sd 5:0:0:0: Attached scsi generic sg0 type 0 [24613.576147] scsi 6:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5 [24613.576267] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB) [24613.576308] sd 6:0:0:0: [sdb] Write Protect is off [24613.576312] sd 6:0:0:0: 
[sdb] Mode Sense: 00 3a 00 00 [24613.576366] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [24613.576460] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB) [24613.576495] sd 6:0:0:0: [sdb] Write Protect is off [24613.576498] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00 [24613.576520] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [24613.576524] sdb: unknown partition table [24613.602070] sd 6:0:0:0: [sdb] Attached SCSI disk [24613.602163] sd 6:0:0:0: Attached scsi generic sg1 type 0 [24613.602252] scsi 7:0:0:0: Direct-Access ATA ST3500630NS 3.AE PQ: 0 ANSI: 5 [24613.602343] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) [24613.602360] sd 7:0:0:0: [sdc] Write Protect is off [24613.602363] sd 7:0:0:0: [sdc] Mode Sense: 00 3a 00 00 [24613.602386] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [24613.602457] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) [24613.602470] sd 7:0:0:0: [sdc] Write Protect is off [24613.602473] sd 7:0:0:0: [sdc] Mode Sense: 00 3a 00 00 [24613.602495] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [24613.602498] sdc: unknown partition table [24613.617764] sd 7:0:0:0: [sdc] Attached SCSI disk [24613.617936] sd 7:0:0:0: Attached scsi generic sg2 type 0 [28434.310751] ata6.00: disabled [28434.311019] sd 5:0:0:0: [sda] Synchronizing SCSI cache [28434.312760] sd 5:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [28434.312769] sd 5:0:0:0: [sda] Stopping disk [28434.313018] sd 5:0:0:0: [sda] START_STOP FAILED [28434.313022] sd 5:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [28434.313242] ata7.00: disabled [28434.313514] sd 6:0:0:0: [sdb] Synchronizing SCSI cache [28434.313680] sd 6:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [28434.313684] sd 6:0:0:0: [sdb] Stopping disk [28434.313806] sd 6:0:0:0: [sdb] START_STOP FAILED 
[28434.313809] sd 6:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [28434.313981] ata8.00: disabled [28434.314179] sd 7:0:0:0: [sdc] Synchronizing SCSI cache [28434.314315] sd 7:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [28434.314320] sd 7:0:0:0: [sdc] Stopping disk [28434.314440] sd 7:0:0:0: [sdc] START_STOP FAILED [28434.314444] sd 7:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [28434.314714] sata_promise 0000:01:08.0: PCI INT A disabled [28498.823936] sata_promise 0000:01:08.0: version 2.12 [28498.825625] sata_promise 0000:01:08.0: PCI INT A -> Link[APC3] -> GSI 18 (level, high) -> IRQ 18 [28498.826212] scsi8 : sata_promise [28498.827942] scsi9 : sata_promise [28498.828206] scsi10 : sata_promise [28498.828716] scsi11 : sata_promise [28498.828838] ata9: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024380 irq 18 [28498.828845] ata10: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024280 irq 18 [28498.828850] ata11: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024200 irq 18 [28498.828854] ata12: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024300 irq 18 [28504.210021] ata9: link is slow to respond, please be patient (ready=-19) [28508.830010] ata9: COMRESET failed (errno=-16) [28514.220012] ata9: link is slow to respond, please be patient (ready=-19) [28518.840009] ata9: COMRESET failed (errno=-16) [28524.230012] ata9: link is slow to respond, please be patient (ready=-19) [28553.870009] ata9: COMRESET failed (errno=-16) [28553.870019] ata9: limiting SATA link speed to 1.5 Gbps [28558.900010] ata9: COMRESET failed (errno=-16) [28558.900017] ata9: reset failed, giving up [28559.250043] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [28564.250027] ata10.00: qc timeout (cmd 0xec) [28564.250242] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4) [28569.640739] ata10: link is slow to respond, please be patient (ready=-19) [28574.260008] ata10: COMRESET failed (errno=-16) [28579.650015] ata10: link is slow to respond, please be 
patient (ready=-19) [28584.270008] ata10: COMRESET failed (errno=-16) [28589.660012] ata10: link is slow to respond, please be patient (ready=-19) [28619.300008] ata10: COMRESET failed (errno=-16) [28619.300018] ata10: limiting SATA link speed to 1.5 Gbps [28624.330009] ata10: COMRESET failed (errno=-16) [28624.330018] ata10: reset failed, giving up [28624.330037] ata10: hard resetting link [28629.720011] ata10: link is slow to respond, please be patient (ready=-19) [28634.340029] ata10: COMRESET failed (errno=-16) [28634.340044] ata10: hard resetting link [28639.730011] ata10: link is slow to respond, please be patient (ready=-19) [28644.350016] ata10: COMRESET failed (errno=-16) [28644.350032] ata10: hard resetting link [28649.740018] ata10: link is slow to respond, please be patient (ready=-19) [28679.380015] ata10: COMRESET failed (errno=-16) [28679.380032] ata10: hard resetting link [28684.410011] ata10: COMRESET failed (errno=-16) [28684.410019] ata10: reset failed, giving up [28684.410026] ata10: EH complete [28684.760044] ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [28689.760028] ata11.00: qc timeout (cmd 0xec) [28689.760242] ata11.00: failed to IDENTIFY (I/O error, err_mask=0x4) [28695.150021] ata11: link is slow to respond, please be patient (ready=-19) [28699.770010] ata11: COMRESET failed (errno=-16) [28705.160064] ata11: link is slow to respond, please be patient (ready=-19) [28709.780011] ata11: COMRESET failed (errno=-16) [28715.170015] ata11: link is slow to respond, please be patient (ready=-19) [28744.810010] ata11: COMRESET failed (errno=-16) [28744.810019] ata11: limiting SATA link speed to 1.5 Gbps [28749.840009] ata11: COMRESET failed (errno=-16) [28749.840016] ata11: reset failed, giving up [28749.840038] ata11: hard resetting link [28755.230012] ata11: link is slow to respond, please be patient (ready=-19) [28759.850009] ata11: COMRESET failed (errno=-16) [28759.850023] ata11: hard resetting link [28765.240010] ata11: link is 
slow to respond, please be patient (ready=-19) [28769.860011] ata11: COMRESET failed (errno=-16) [28769.860026] ata11: hard resetting link [28775.250010] ata11: link is slow to respond, please be patient (ready=-19) [28804.890008] ata11: COMRESET failed (errno=-16) [28804.890024] ata11: hard resetting link [28809.920011] ata11: COMRESET failed (errno=-16) [28809.920019] ata11: reset failed, giving up [28809.920026] ata11: EH complete [28810.270043] ata12: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [28815.270028] ata12.00: qc timeout (cmd 0xec) [28815.270244] ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4) [28820.660011] ata12: link is slow to respond, please be patient (ready=-19) [28825.280010] ata12: COMRESET failed (errno=-16) [28830.670011] ata12: link is slow to respond, please be patient (ready=-19) [28835.290010] ata12: COMRESET failed (errno=-16) [28840.680011] ata12: link is slow to respond, please be patient (ready=-19) [28870.320011] ata12: COMRESET failed (errno=-16) [28870.320021] ata12: limiting SATA link speed to 1.5 Gbps [28875.350016] ata12: COMRESET failed (errno=-16) [28875.350024] ata12: reset failed, giving up [28875.350046] ata12: hard resetting link [28880.740018] ata12: link is slow to respond, please be patient (ready=-19) [28885.360018] ata12: COMRESET failed (errno=-16) [28885.360034] ata12: hard resetting link [28890.750011] ata12: link is slow to respond, please be patient (ready=-19) [28895.370015] ata12: COMRESET failed (errno=-16) [28895.370031] ata12: hard resetting link [28900.760011] ata12: link is slow to respond, please be patient (ready=-19) [28930.400011] ata12: COMRESET failed (errno=-16) [28930.400028] ata12: hard resetting link [28935.430011] ata12: COMRESET failed (errno=-16) [28935.430020] ata12: reset failed, giving up [28935.430027] ata12: EH complete 6. then I tried modprobe -r and modprobe once again but without success. 
[29272.494502] sata_promise 0000:01:08.0: PCI INT A disabled [29278.383748] sata_promise 0000:01:08.0: version 2.12 [29278.385439] sata_promise 0000:01:08.0: PCI INT A -> Link[APC3] -> GSI 18 (level, high) -> IRQ 18 [29278.386014] scsi12 : sata_promise [29278.387767] scsi13 : sata_promise [29278.388031] scsi14 : sata_promise [29278.388544] scsi15 : sata_promise [29278.388665] ata13: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024380 irq 18 [29278.388671] ata14: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024280 irq 18 [29278.388676] ata15: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024200 irq 18 [29278.388680] ata16: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024300 irq 18 [29283.770010] ata13: link is slow to respond, please be patient (ready=-19) [29288.390023] ata13: COMRESET failed (errno=-16) [29293.780010] ata13: link is slow to respond, please be patient (ready=-19) [29298.400011] ata13: COMRESET failed (errno=-16) [29303.790019] ata13: link is slow to respond, please be patient (ready=-19) [29333.430019] ata13: COMRESET failed (errno=-16) [29333.430029] ata13: limiting SATA link speed to 1.5 Gbps [29338.460010] ata13: COMRESET failed (errno=-16) [29338.460018] ata13: reset failed, giving up [29343.850019] ata14: link is slow to respond, please be patient (ready=-19) [29348.470020] ata14: COMRESET failed (errno=-16) [29353.860011] ata14: link is slow to respond, please be patient (ready=-19) [29358.480011] ata14: COMRESET failed (errno=-16) [29363.870009] ata14: link is slow to respond, please be patient (ready=-19) [29393.510011] ata14: COMRESET failed (errno=-16) [29393.510021] ata14: limiting SATA link speed to 1.5 Gbps [29398.540018] ata14: COMRESET failed (errno=-16) [29398.540026] ata14: reset failed, giving up [29403.930010] ata15: link is slow to respond, please be patient (ready=-19) [29408.550022] ata15: COMRESET failed (errno=-16) [29413.940023] ata15: link is slow to respond, please be patient (ready=-19) [29418.560013] ata15: 
COMRESET failed (errno=-16) [29423.950015] ata15: link is slow to respond, please be patient (ready=-19) [29453.590010] ata15: COMRESET failed (errno=-16) [29453.590019] ata15: limiting SATA link speed to 1.5 Gbps [29458.620009] ata15: COMRESET failed (errno=-16) [29458.620017] ata15: reset failed, giving up [29464.010010] ata16: link is slow to respond, please be patient (ready=-19) [29468.630018] ata16: COMRESET failed (errno=-16) [29474.020009] ata16: link is slow to respond, please be patient (ready=-19) [29478.640743] ata16: COMRESET failed (errno=-16) [29484.030010] ata16: link is slow to respond, please be patient (ready=-19) [29513.670012] ata16: COMRESET failed (errno=-16) [29513.670022] ata16: limiting SATA link speed to 1.5 Gbps [29518.700010] ata16: COMRESET failed (errno=-16) [29518.700018] ata16: reset failed, giving up Best regards, Peter ^ permalink raw reply [flat|nested] 31+ messages in thread
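The reload experiment above produces a long log, and a per-port tally makes the pattern (which ports ever bring a link up, which only fail COMRESET) easier to see. A minimal sketch in Python, assuming the log is available as text in the form shown here, with entries separated only by their `[timestamp]` prefixes. `summarize` and the two regular expressions are illustrative names, not part of any kernel tooling:

```python
import re
from collections import Counter

# One kernel log entry: "[  123.456789] message", running up to the next
# timestamp or the end of the text. Bracketed tokens such as "[sda]" are
# not mistaken for timestamps because they lack the digits.digits form.
ENTRY_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(.*?)(?=\s*\[\s*\d+\.\d+\]|$)", re.S
)
# "ata5: ..." or "ata5.00: ..." -> port name and the rest of the message.
PORT_RE = re.compile(r"^(ata\d+)(?:\.\d+)?: (.*)$")


def summarize(log_text):
    """Count COMRESET failures and 'SATA link up' events per ATA port."""
    failures = Counter()
    link_up = Counter()
    for _timestamp, message in ENTRY_RE.findall(log_text):
        match = PORT_RE.match(message.strip())
        if not match:
            continue
        port, text = match.groups()
        if text.startswith("COMRESET failed"):
            failures[port] += 1
        elif text.startswith("SATA link up"):
            link_up[port] += 1
    return failures, link_up


if __name__ == "__main__":
    sample = (
        "[ 1944.960008] ata5: COMRESET failed (errno=-16) "
        "[ 1949.990008] ata5: COMRESET failed (errno=-16) "
        "[ 1950.340041] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
    )
    failures, link_up = summarize(sample)
    print(dict(failures), dict(link_up))
```

Because libata assigns new port numbers on every module load (ata1-4, then ata5-8, and so on), the tally is per load as well as per port. Mapping the numbers back to physical ports would have to go by the `ata 0x...` addresses printed at probe time.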
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-18 18:03 ` Peter Favrholdt @ 2008-11-19 1:55 ` Tejun Heo 2008-11-20 10:22 ` Peter Favrholdt 0 siblings, 1 reply; 31+ messages in thread From: Tejun Heo @ 2008-11-19 1:55 UTC (permalink / raw) To: Peter Favrholdt; +Cc: linux-ide [-- Attachment #1: Type: text/plain, Size: 1955 bytes --] Peter Favrholdt wrote: > Hi Tejun & list > > Tejun Heo wrote: >> Peter Favrholdt wrote: >>> I tried removing sata_promise the nice way using modprobe: >>> >>> # modprobe -r -f sata_promise >>> FATAL: Module sata_promise is in use. >>> >>> then the not-so-nice way using rmmod -f and inserting it again. It >>> doesn't work. Here are the relevant parts from dmesg (removing and >>> inserting): >> >> Oh... you first need to wait for the kernel to be done with the devices. >> ie. unmount all filesystems living on those && wait till EH finishes. > > I didn't have any filesystems mounted but learned that the 4 disks > participate in a md raid5 set. To be sure I tried everything again from > the beginning (reboot) without enabling md on the four drives. Here are > the results: > > 1. started the four dd processes which ran to completion without any > errors. Started them again and this time sda failed as usual. > > 2. modprobe -r sata_promise - no warnings/errors: removed successfully > > 3. modprobe sata_promise - only three drives show up > > 4. modprobe -r sata_promise - no warnings/errors: removed successfully > > 5. 
modprobe sata_promise - more channels complain - no drives are found > > dmesg: > > [21626.490086] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 > action 0x6 frozen > [21626.490096] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } > [21626.490105] ata1.00: cmd 25/00:00:00:1f:e6/00:02:01:00:00/e0 tag 0 > dma 262144 in > [21626.490107] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask > 0x4 (timeout) > [21626.490110] ata1.00: status: { DRDY } > [21626.490162] ata1: hard resetting link > [21631.881055] ata1: link is slow to respond, please be patient (ready=-19) > [21636.500030] ata1: COMRESET failed (errno=-16) > [21636.500068] ata1: hard resetting link COMRESETs are failing with EBUSY while ata_sff_check_ready() is returning ENODEV. Hmmm... Does the attached patch change anything? -- tejun [-- Attachment #2: sata_promise-hrst-debug.patch --] [-- Type: text/x-patch, Size: 707 bytes --] diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c index ba9a257..a06af2c 100644 --- a/drivers/ata/sata_promise.c +++ b/drivers/ata/sata_promise.c @@ -709,8 +709,13 @@ static int pdc_pata_softreset(struct ata_link *link, unsigned int *class, static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, unsigned long deadline) { + const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context); + bool online; + int rc; + pdc_reset_port(link->ap); - return sata_sff_hardreset(link, class, deadline); + rc = sata_link_hardreset(link, timing, deadline, &online, NULL); + return online ? -EAGAIN : rc; } static void pdc_error_handler(struct ata_port *ap) ^ permalink raw reply related [flat|nested] 31+ messages in thread
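A note for readers following the log lines above: the numbers in "COMRESET failed (errno=-16)" and "(ready=-19)" are negated errno values, which is how Tejun reads them as EBUSY and ENODEV. The lookup can be sketched as below. This is only an illustration: errno_name() and errno_is() are hypothetical helpers written for this note, not libata functions, and the sketch assumes Linux errno numbering.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not a kernel function): map the negative errno
 * values printed in libata messages to their symbolic names.
 * Only the values that show up in this thread are covered. */
static const char *errno_name(int neg_errno)
{
    assert(neg_errno <= 0);       /* kernel code reports errors as -Exxx */

    switch (-neg_errno) {
    case EIO:    return "EIO";    /* "revalidation failed (errno=-5)" */
    case EBUSY:  return "EBUSY";  /* "COMRESET failed (errno=-16)" */
    case ENODEV: return "ENODEV"; /* "(ready=-19)" */
    default:     return "unknown";
    }
}

/* Hypothetical convenience check: does this logged value mean `name`? */
static int errno_is(int neg_errno, const char *name)
{
    return strcmp(errno_name(neg_errno), name) == 0;
}
```

With these helpers, errno_is(-16, "EBUSY") and errno_is(-19, "ENODEV") are both true, matching the reading given in the message above.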
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
  2008-11-19 1:55 ` Tejun Heo
@ 2008-11-20 10:22 ` Peter Favrholdt
  2008-11-20 11:10 ` Mikael Pettersson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Favrholdt @ 2008-11-20 10:22 UTC (permalink / raw)
  To: linux-ide; +Cc: Tejun Heo

Hi Tejun and list,

Tejun Heo wrote:
> COMRESETs are failing with EBUSY while ata_sff_check_ready() is
> returning ENODEV. Hmmm... Does the attached patch change anything?

patching file drivers/ata/sata_promise.c
Hunk #1 succeeded at 707 (offset -2 lines).

Yes, it actually helped. Tested it twice and both times the following
happened:

1. started the dd's from all four drives

2. /dev/sda fails after a while as usual

3. two hardresets happen on sda - the second limits the link to 1.5Gbps

4. dd continues without hiccups (and when completed normally I started
yet another dd test which ran to completion without any errors).

So at the first error the hardreset kicks in and saves the day (and
downgrades to 1.5Gbps which prevents the problem from happening in my
setup).

Thanks a lot!

Best regards, Peter Favrholdt Here is the dmesg: [ 648.683865] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x180000 action 0x6 [ 648.683872] ata1.00: port_status 0x20280000 [ 648.683876] ata1: SError: { 10B8B Dispar } [ 648.683884] ata1.00: cmd c8/00:00:00:0e:e6/00:00:00:00:00/e1 tag 0 dma 131072 in [ 648.683886] res 51/84:00:00:0e:e6/00:00:00:00:00/e1 Emask 0x12 (ATA bus error) [ 648.683889] ata1.00: status: { DRDY ERR } [ 648.683891] ata1.00: error: { ICRC ABRT } [ 648.683942] ata1: hard resetting link [ 649.190217] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 649.334082] ata1.00: configured for UDMA/133 [ 649.334354] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t4 [ 649.334357] ata1: hotplug_status 0x80 [ 649.492347] ata1.00: configured for UDMA/133 [ 649.492363] ata1: EH complete [ 649.537356] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) [ 649.540275] sd 0:0:0:0: [sda] Write Protect is off [ 649.540284] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [ 649.547891] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 649.555009] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) [ 649.557709] sd 0:0:0:0: [sda] Write Protect is off [ 649.557719] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [ 649.991649] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x180000 action 0x6 [ 649.991657] ata1.00: port_status 0x20280000 [ 649.991662] ata1: SError: { 10B8B Dispar } [ 649.991669] ata1.00: cmd c8/00:00:00:17:e6/00:00:00:00:00/e1 tag 0 dma 131072 in [ 649.991671] res 51/84:00:00:17:e6/00:00:00:00:00/e1 Emask 0x12 (ATA bus error) [ 649.991674] ata1.00: status: { DRDY ERR } [ 649.991676] ata1.00: error: { ICRC ABRT } [ 649.991712] ata1: hard resetting link [ 650.500230] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 650.641824] ata1.00: configured for UDMA/133 [ 650.642194] ata1: limiting SATA link speed to 1.5 Gbps [ 650.642200] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xf 
t4 [ 650.642202] ata1: hotplug_status 0x80 [ 650.642252] ata1: hard resetting link [ 651.550344] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [ 651.700416] ata1.00: configured for UDMA/133 [ 651.700678] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t3 [ 651.700681] ata1: hotplug_status 0x80 [ 651.882957] ata1.00: configured for UDMA/133 [ 651.882973] ata1: EH complete [ 651.931329] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 651.939665] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) [ 651.943942] sd 0:0:0:0: [sda] Write Protect is off [ 651.943952] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [ 651.952162] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA ^ permalink raw reply [flat|nested] 31+ messages in thread
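The "SStatus 123 SControl 300" and "SStatus 113 SControl 310" values in the dmesg above show the speed downgrade directly. libata prints both registers in hex, and both use the same field layout in the Serial ATA specification. A minimal sketch of the decoding follows; scr_decode() and spd_to_mbps() are hypothetical names chosen for this note, not kernel functions.

```c
#include <assert.h>

/* SStatus and SControl share one layout:
 * DET = bits 3:0, SPD = bits 7:4, IPM = bits 11:8.
 * "SStatus 123" in the log is the hex value 0x123. */
struct scr_fields {
    unsigned int det;
    unsigned int spd;
    unsigned int ipm;
};

static struct scr_fields scr_decode(unsigned int reg)
{
    struct scr_fields f;

    f.det = reg & 0xf;
    f.spd = (reg >> 4) & 0xf;
    f.ipm = (reg >> 8) & 0xf;
    return f;
}

/* Link speed in Mbps for an SPD field value.
 * Returns 0 for "no speed / no limit" and for values this sketch
 * does not know about. */
static unsigned int spd_to_mbps(unsigned int spd)
{
    assert(spd <= 0xf);           /* SPD is a 4-bit field */

    switch (spd) {
    case 1:  return 1500;         /* Gen1, 1.5 Gbps */
    case 2:  return 3000;         /* Gen2, 3.0 Gbps */
    default: return 0;
    }
}
```

Applied to the log: SStatus 123 has DET=3 (device present, PHY communication established), SPD=2 (3.0 Gbps) and IPM=1 (active). After "limiting SATA link speed to 1.5 Gbps", SControl 310 carries SPD=1 as the allowed-speed limit and SStatus 113 reports the link negotiated at SPD=1.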
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux
  2008-11-20 10:22 ` Peter Favrholdt
@ 2008-11-20 11:10 ` Mikael Pettersson
  2008-11-21 4:42 ` Tejun Heo
  2008-11-21 4:56 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST Tejun Heo
  0 siblings, 2 replies; 31+ messages in thread
From: Mikael Pettersson @ 2008-11-20 11:10 UTC (permalink / raw)
  To: Peter Favrholdt; +Cc: linux-ide, Tejun Heo

Peter Favrholdt writes:
 > Hi Tejun and list,
 >
 > Tejun Heo wrote:
 > > COMRESETs are failing with EBUSY while ata_sff_check_ready() is
 > > returning ENODEV. Hmmm... Does the attached patch change anything?
 >
 > patching file drivers/ata/sata_promise.c
 > Hunk #1 succeeded at 707 (offset -2 lines).
 >
 > Yes, it actually helped. Tested it twice and both times the following
 > happened:
 >
 > 1. started the dd's from all four drives
 >
 > 2. /dev/sda fails after a while as usual
 >
 > 3. two hardresets happen on sda - the second limits the link to 1.5Gbps
 >
 > 4. dd continues without hiccups (and when completed normally I started
 > yet another dd test which ran to completion without any errors).
 >
 > So at the first error the hardreset kicks in and saves the day (and
 > downgrades to 1.5Gbps which prevents the problem from happening in my
 > setup).
 >
 > Thanks a lot!

Interesting.

Tejun, what's the functional difference between the original
sata reset method and the one you changed it to?

> [ 649.991712] ata1: hard resetting link > [ 650.500230] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [ 650.641824] ata1.00: configured for UDMA/133 > [ 650.642194] ata1: limiting SATA link speed to 1.5 Gbps > [ 650.642200] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xf t4 > [ 650.642202] ata1: hotplug_status 0x80 > [ 650.642252] ata1: hard resetting link > [ 651.550344] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) > [ 651.700416] ata1.00: configured for UDMA/133 > [ 651.700678] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t3 > [ 651.700681] ata1: hotplug_status 0x80 These hotplug_status things worry me. Hard resets trigger hotplug events, but all recent kernels should disable them in ->freeze. This used to be the case. Either libata isn't freezing the port before the reset, or ->freeze isn't marking the port properly causing the interrupt handler to interpret (silent) hotplug status bits when it shouldn't, or (horrors) ->freeze isn't working any more. Stray hotplug events prolong EH handling sequences and can even cause port disabling due to EH failures. /Mikael ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: FYI: BUG in SATA Promise 300 TX4 (2.6.24 - 2.6.27-3) w/Linux 2008-11-20 11:10 ` Mikael Pettersson @ 2008-11-21 4:42 ` Tejun Heo 2008-11-21 4:56 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST Tejun Heo 1 sibling, 0 replies; 31+ messages in thread From: Tejun Heo @ 2008-11-21 4:42 UTC (permalink / raw) To: Mikael Pettersson; +Cc: Peter Favrholdt, linux-ide Hello, Mikael. Mikael Pettersson wrote: > Tejun, what's the functional difference between the original > sata reset method and the one you changed it to? It means that the controller is having trouble updating the TF registers from the first D2H Reg FIS after hardreset and thus requires a follow-up SRST to get the device signature and initial DRDY status right. This is actually quite common among SATA controllers. Some ahci's and sata_sil24 can't do it either. sata_sil is one of the best behaving ones in this regard. It looks like sata_promise needs follow-up SRST after all. I'll put the header and send this patch to Jeff. > > [ 649.991712] ata1: hard resetting link > > [ 650.500230] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > > [ 650.641824] ata1.00: configured for UDMA/133 > > [ 650.642194] ata1: limiting SATA link speed to 1.5 Gbps > > [ 650.642200] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xf t4 > > [ 650.642202] ata1: hotplug_status 0x80 > > [ 650.642252] ata1: hard resetting link > > [ 651.550344] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) > > [ 651.700416] ata1.00: configured for UDMA/133 > > [ 651.700678] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t3 > > [ 651.700681] ata1: hotplug_status 0x80 > > These hotplug_status things worry me. Hard resets trigger > hotplug events, but all recent kernels should disable them > in ->freeze. This used to be the case. 
> > Either libata isn't freezing the port before the reset, > or ->freeze isn't marking the port properly causing the > interrupt handler to interpret (silent) hotplug status bits > when it shouldn't, or (horrors) ->freeze isn't working any more. libata definitely is calling ->freeze but some controllers seem to ignore IRQ masking during hardreset. Maybe promise falls in this category too? For those controllers, irq handler should be ready for spurious interrupts during reset. If anything needs to be done differently, it can use ATA_PFLAG_RESETTING to tell whether reset is in progress (see ahci.c for example). > Stray hotplug events prolong EH handling sequences and can even > cause port disabling due to EH failures. libata EH should be smart enough to handle this if the PHY event happens during reset and scheduled reset. EH will clear those actions on successful completion of reset but before verifying the connected devices (spurious events are ignored w/o losing actual device presence). Thanks. -- tejun ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-20 11:10 ` Mikael Pettersson 2008-11-21 4:42 ` Tejun Heo @ 2008-11-21 4:56 ` Tejun Heo 2008-11-22 16:30 ` Mikael Pettersson ` (3 more replies) 1 sibling, 4 replies; 31+ messages in thread From: Tejun Heo @ 2008-11-21 4:56 UTC (permalink / raw) To: Mikael Pettersson, Jeff Garzik; +Cc: Peter Favrholdt, linux-ide sata_promise hardreset doesn't seem to be able to acquire the initial D2H Reg FIS after hardreset leading to hardreset timeouts. Request follow-up SRST. http://article.gmane.org/gmane.linux.ide/36186 Signed-off-by: Tejun Heo <tj@kernel.org> --- Peter, can you please test this one too? It's essentially the same code just slightly prettier. Mikael, what do you think about this? Jeff, please commit only after Peter's verification and Mikael's ACK. Thanks. drivers/ata/sata_promise.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c index ba9a257..917038a 100644 --- a/drivers/ata/sata_promise.c +++ b/drivers/ata/sata_promise.c @@ -710,7 +710,12 @@ static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, unsigned long deadline) { pdc_reset_port(link->ap); - return sata_sff_hardreset(link, class, deadline); + + /* sata_promise can't reliably acquire the first D2H Reg FIS + * after hardreset. Do non-waiting hardreset and request + * follow-up SRST. + */ + return sata_std_hardreset(link, class, deadline); } static void pdc_error_handler(struct ata_port *ap) ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-21 4:56 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST Tejun Heo @ 2008-11-22 16:30 ` Mikael Pettersson 2008-11-23 22:38 ` Peter Favrholdt ` (2 subsequent siblings) 3 siblings, 0 replies; 31+ messages in thread From: Mikael Pettersson @ 2008-11-22 16:30 UTC (permalink / raw) To: Tejun Heo; +Cc: Mikael Pettersson, Jeff Garzik, Peter Favrholdt, linux-ide Tejun Heo writes: > sata_promise hardreset doesn't seem to be able to acquire the initial > D2H Reg FIS after hardreset leading to hardreset timeouts. Request > follow-up SRST. > > http://article.gmane.org/gmane.linux.ide/36186 > > Signed-off-by: Tejun Heo <tj@kernel.org> > --- > Peter, can you please test this one too? It's essentially the same > code just slightly prettier. Mikael, what do you think about this? Works fine in my test machine. Acked-by: Mikael Pettersson <mikpe@it.uu.se> > Jeff, please commit only after Peter's verification and Mikael's ACK. > > Thanks. > > drivers/ata/sata_promise.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c > index ba9a257..917038a 100644 > --- a/drivers/ata/sata_promise.c > +++ b/drivers/ata/sata_promise.c > @@ -710,7 +710,12 @@ static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, > unsigned long deadline) > { > pdc_reset_port(link->ap); > - return sata_sff_hardreset(link, class, deadline); > + > + /* sata_promise can't reliably acquire the first D2H Reg FIS > + * after hardreset. Do non-waiting hardreset and request > + * follow-up SRST. > + */ > + return sata_std_hardreset(link, class, deadline); > } > > static void pdc_error_handler(struct ata_port *ap) > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST
  2008-11-21 4:56 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST Tejun Heo
  2008-11-22 16:30 ` Mikael Pettersson
@ 2008-11-23 22:38 ` Peter Favrholdt
  2008-11-25 13:00 ` Peter Favrholdt
  2008-11-25 17:27 ` Jeff Garzik
  3 siblings, 0 replies; 31+ messages in thread
From: Peter Favrholdt @ 2008-11-23 22:38 UTC (permalink / raw)
  To: Tejun Heo, linux-ide

Hi Tejun and list,

This mail is just to say I'm alive and testing the patch, but so far no
errors (on my hardware).

Tejun Heo wrote:
> sata_promise hardreset doesn't seem to be able to acquire the initial
> D2H Reg FIS after hardreset leading to hardreset timeouts. Request
> follow-up SRST.
>
> http://article.gmane.org/gmane.linux.ide/36186
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
> Peter, can you please test this one too? It's essentially the same
> code just slightly prettier. Mikael, what do you think about this?

I'm still testing the new patch (on 2.6.27.7), but so far I haven't had
any failures - it seems my hardware has started to behave well - very
unfortunate ;-)

> Jeff, please commit only after Peter's verification and Mikael's ACK.

I'll keep on testing - hopefully I'll see some of the usual errors soon -
and the new patch will recover alright. I'll keep you updated.

Best regards, Peter > drivers/ata/sata_promise.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c > index ba9a257..917038a 100644 > --- a/drivers/ata/sata_promise.c > +++ b/drivers/ata/sata_promise.c > @@ -710,7 +710,12 @@ static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, > unsigned long deadline) > { > pdc_reset_port(link->ap); > - return sata_sff_hardreset(link, class, deadline); > + > + /* sata_promise can't reliably acquire the first D2H Reg FIS > + * after hardreset. Do non-waiting hardreset and request > + * follow-up SRST. > + */ > + return sata_std_hardreset(link, class, deadline); > } > > static void pdc_error_handler(struct ata_port *ap) ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-21 4:56 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST Tejun Heo 2008-11-22 16:30 ` Mikael Pettersson 2008-11-23 22:38 ` Peter Favrholdt @ 2008-11-25 13:00 ` Peter Favrholdt 2008-11-26 2:46 ` Tejun Heo 2008-11-25 17:27 ` Jeff Garzik 3 siblings, 1 reply; 31+ messages in thread From: Peter Favrholdt @ 2008-11-25 13:00 UTC (permalink / raw) To: linux-ide; +Cc: Tejun Heo, Mikael Pettersson, Jeff Garzik Hi Tejun, After running dd several times finally my setup failed. Unfortunately it didn't recover :-( Tejun Heo wrote: > sata_promise hardreset doesn't seem to be able to acquire the initial > D2H Reg FIS after hardreset leading to hardreset timeouts. Request > follow-up SRST. > > http://article.gmane.org/gmane.linux.ide/36186 > > Signed-off-by: Tejun Heo <tj@kernel.org> > --- > Peter, can you please test this one too? It's essentially the same > code just slightly prettier. Mikael, what do you think about this? /usr/src/linux-2.6.27.7# patch -p1 <../sata_promise_tejun_heo_2008-11-21.patch patching file drivers/ata/sata_promise.c Hunk #1 succeeded at 708 (offset -2 lines). 
Here is dmesg: [133424.882458] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x6 frozen [133424.882468] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } [133424.882477] ata1.00: cmd c8/00:00:a0:2c:e6/00:00:00:00:00/e1 tag 0 dma 131072 in [133424.882478] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [133424.882482] ata1.00: status: { DRDY } [133424.882522] ata1: hard resetting link [133430.430034] ata1: link is slow to respond, please be patient (ready=-19) [133434.930042] ata1: SRST failed (errno=-16) [133434.930137] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133439.950170] ata1.00: qc timeout (cmd 0xec) [133439.950424] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [133439.950428] ata1.00: revalidation failed (errno=-5) [133439.950480] ata1: hard resetting link [133440.460206] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133440.614296] ata1.00: configured for UDMA/133 [133440.614557] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t4 [133440.614560] ata1: hotplug_status 0x80 [133440.764137] ata1.00: configured for UDMA/133 [133440.764152] ata1: EH complete [133440.825887] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) [133440.830103] sd 0:0:0:0: [sda] Write Protect is off [133440.830113] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [133440.838448] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [133470.830077] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x6 frozen [133470.830087] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } [133470.830096] ata1.00: cmd c8/00:00:a0:31:e6/00:00:00:00:00/e1 tag 0 dma 131072 in [133470.830098] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [133470.830101] ata1.00: status: { DRDY } [133470.830131] ata1: hard resetting link [133476.380130] ata1: link is slow to respond, please be patient (ready=-19) [133480.881592] ata1: SRST failed (errno=-16) [133480.881690] ata1: SATA link up 3.0 
Gbps (SStatus 123 SControl 300) [133485.882726] ata1.00: qc timeout (cmd 0xec) [133485.882980] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [133485.882985] ata1.00: revalidation failed (errno=-5) [133485.883019] ata1: hard resetting link [133491.430035] ata1: link is slow to respond, please be patient (ready=-19) [133495.930034] ata1: SRST failed (errno=-16) [133495.930137] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133505.951635] ata1.00: qc timeout (cmd 0xec) [133505.951875] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [133505.951879] ata1.00: revalidation failed (errno=-5) [133505.951930] ata1: hard resetting link [133511.500191] ata1: link is slow to respond, please be patient (ready=-19) [133516.010031] ata1: SRST failed (errno=-16) [133516.010147] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133546.030141] ata1.00: qc timeout (cmd 0xec) [133546.030373] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [133546.030377] ata1.00: revalidation failed (errno=-5) [133546.030380] ata1.00: disabled [133546.030406] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t4 [133546.030409] ata1: hotplug_status 0x80 [133546.030456] ata1: hard resetting link [133551.980075] ata1: link is slow to respond, please be patient (ready=-19) [133556.060075] ata1: SRST failed (errno=-16) [133556.060184] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133556.060231] ata1: link online but device misclassified, retrying [133556.060265] ata1: hard resetting link [133562.010032] ata1: link is slow to respond, please be patient (ready=-19) [133566.100094] ata1: SRST failed (errno=-16) [133566.100195] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133566.100242] ata1: link online but device misclassified, retrying [133566.100285] ata1: hard resetting link [133572.050066] ata1: link is slow to respond, please be patient (ready=-19) [133601.160039] ata1: SRST failed (errno=-16) [133601.160150] ata1: SATA link up 3.0 Gbps 
(SStatus 123 SControl 300) [133601.160190] ata1: link online but device misclassified, retrying [133601.160198] ata1: limiting SATA link speed to 1.5 Gbps [133601.160247] ata1: hard resetting link [133606.220108] ata1: SRST failed (errno=-16) [133606.220209] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [133606.220250] ata1: link online but device misclassified, device detection might fail [133606.220520] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t3 [133606.220523] ata1: hotplug_status 0x80 [133606.220559] ata1: hard resetting link [133612.170073] ata1: link is slow to respond, please be patient (ready=-19) [133616.250074] ata1: SRST failed (errno=-16) [133616.250176] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133616.250218] ata1: link online but device misclassified, retrying [133616.250259] ata1: hard resetting link [133622.200033] ata1: link is slow to respond, please be patient (ready=-19) [133626.280039] ata1: SRST failed (errno=-16) [133626.280117] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133626.280151] ata1: link online but device misclassified, retrying [133626.280297] ata1: hard resetting link [133632.230039] ata1: link is slow to respond, please be patient (ready=-19) [133661.340061] ata1: SRST failed (errno=-16) [133661.340127] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133661.340174] ata1: link online but device misclassified, retrying [133661.340181] ata1: limiting SATA link speed to 1.5 Gbps [133661.340216] ata1: hard resetting link [133666.400844] ata1: SRST failed (errno=-16) [133666.400939] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [133666.400980] ata1: link online but device misclassified, device detection might fail [133666.401240] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t2 [133666.401243] ata1: hotplug_status 0x80 [133666.401301] ata1: hard resetting link [133672.360283] ata1: link is slow to respond, please be patient (ready=-19) [133676.440038] ata1: 
SRST failed (errno=-16) [133676.440126] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133676.440159] ata1: link online but device misclassified, retrying [133676.440206] ata1: hard resetting link [133682.390310] ata1: link is slow to respond, please be patient (ready=-19) [133686.470083] ata1: SRST failed (errno=-16) [133686.470198] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133686.470239] ata1: link online but device misclassified, retrying [133686.470273] ata1: hard resetting link [133692.430263] ata1: link is slow to respond, please be patient (ready=-19) [133721.530036] ata1: SRST failed (errno=-16) [133721.530138] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133721.530187] ata1: link online but device misclassified, retrying [133721.530194] ata1: limiting SATA link speed to 1.5 Gbps [133721.530235] ata1: hard resetting link [133726.580952] ata1: SRST failed (errno=-16) [133726.581062] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [133726.581095] ata1: link online but device misclassified, device detection might fail [133726.581358] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t1 [133726.581361] ata1: hotplug_status 0x80 [133726.581412] ata1: hard resetting link [133732.531480] ata1: link is slow to respond, please be patient (ready=-19) [133736.610030] ata1: SRST failed (errno=-16) [133736.610126] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133736.610173] ata1: link online but device misclassified, retrying [133736.610207] ata1: hard resetting link [133742.560043] ata1: link is slow to respond, please be patient (ready=-19) [133746.640042] ata1: SRST failed (errno=-16) [133746.640120] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [133746.640252] ata1: link online but device misclassified, retrying [133746.640272] ata1: hard resetting link [133752.600033] ata1: link is slow to respond, please be patient (ready=-19) [133781.650358] ata1: SRST failed (errno=-16) [133781.650380] ata1: SATA 
link up 3.0 Gbps (SStatus 123 SControl 300) [133781.650389] ata1: link online but device misclassified, retrying [133781.650394] ata1: limiting SATA link speed to 1.5 Gbps [133781.650402] ata1: hard resetting link [133786.700144] ata1: SRST failed (errno=-16) [133786.700247] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [133786.700279] ata1: link online but device misclassified, device detection might fail [133786.700288] ata1: EH pending after 5 tries, giving up [133786.700303] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 [133786.700307] sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor] [133786.700312] Descriptor sense data with sense descriptors (in hex): [133786.700314] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 [133786.700321] 00 00 00 00 [133786.700324] sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0 [133786.700327] end_request: I/O error, dev sda, sector 31863200 [133786.700332] Buffer I/O error on device sda, logical block 3982900 [133786.700337] Buffer I/O error on device sda, logical block 3982901 [133786.700341] Buffer I/O error on device sda, logical block 3982902 [133786.700344] Buffer I/O error on device sda, logical block 3982903 [133786.700347] Buffer I/O error on device sda, logical block 3982904 [133786.700350] Buffer I/O error on device sda, logical block 3982905 [133786.700353] Buffer I/O error on device sda, logical block 3982906 [133786.700356] Buffer I/O error on device sda, logical block 3982907 [133786.700360] Buffer I/O error on device sda, logical block 3982908 [133786.700363] Buffer I/O error on device sda, logical block 3982909 [133786.700384] sd 0:0:0:0: rejecting I/O to offline device [133786.700390] sd 0:0:0:0: rejecting I/O to offline device [133786.700422] ata1: EH complete [133786.700444] sd 0:0:0:0: rejecting I/O to offline device [133786.700451] sd 0:0:0:0: rejecting I/O to offline device [133786.700457] sd 0:0:0:0: rejecting I/O to offline device [133786.700460] sd 0:0:0:0: [sda] READ CAPACITY failed 
[133786.700463] sd 0:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00 [133786.700466] sd 0:0:0:0: [sda] Sense not available. [133786.700471] sd 0:0:0:0: rejecting I/O to offline device [133786.700476] sd 0:0:0:0: [sda] Write Protect is off [133786.700478] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00 [133786.700483] sd 0:0:0:0: rejecting I/O to offline device [133786.700487] sd 0:0:0:0: [sda] Asking for cache data failed [133786.700489] sd 0:0:0:0: [sda] Assuming drive cache: write through [133786.700496] ata1.00: detaching (SCSI 0:0:0:0) [133786.701352] sd 0:0:0:0: [sda] Stopping disk [133786.702316] sd 0:0:0:0: [sda] START_STOP FAILED [133786.702320] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 > Jeff, please commit only after Peter's verification and Mikael's ACK. > > Thanks. > > drivers/ata/sata_promise.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c > index ba9a257..917038a 100644 > --- a/drivers/ata/sata_promise.c > +++ b/drivers/ata/sata_promise.c > @@ -710,7 +710,12 @@ static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, > unsigned long deadline) > { > pdc_reset_port(link->ap); > - return sata_sff_hardreset(link, class, deadline); > + > + /* sata_promise can't reliably acquire the first D2H Reg FIS > + * after hardreset. Do non-waiting hardreset and request > + * follow-up SRST. > + */ > + return sata_std_hardreset(link, class, deadline); > } > > static void pdc_error_handler(struct ata_port *ap) > -- > To unsubscribe from this list: send the line "unsubscribe linux-ide" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 31+ messages in thread
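Each failing run in the logs above begins with a line such as "SErr 0x1380000" followed by "SError: { 10B8B Dispar BadCRC TrStaTrns }" (the earlier, recoverable runs show "SErr 0x180000" and "{ 10B8B Dispar }"). The names are a bit-by-bit decoding of the diagnostic half of the SATA SError register. A rough sketch of that decoding follows. decode_serror() is a hypothetical helper, not the libata code. The bit positions follow the Serial ATA specification. The four names seen in this thread are taken from the logs, while the remaining names are recalled from libata's output and should be treated as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct serr_bit {
    unsigned int mask;
    const char *name;
};

/* Diagnostic (upper) half of SError, bits 16..26, in ascending order. */
static const struct serr_bit serr_diag_bits[] = {
    { 1u << 16, "PHYRdyChg" },   /* assumed spelling */
    { 1u << 17, "PHYInt" },      /* assumed spelling */
    { 1u << 18, "CommWake" },    /* assumed spelling */
    { 1u << 19, "10B8B" },       /* 10b-to-8b decode error (seen in logs) */
    { 1u << 20, "Dispar" },      /* disparity error (seen in logs) */
    { 1u << 21, "BadCRC" },      /* CRC error (seen in logs) */
    { 1u << 22, "Handshk" },     /* assumed spelling */
    { 1u << 23, "LinkSeq" },     /* assumed spelling */
    { 1u << 24, "TrStaTrns" },   /* transport state transition (seen in logs) */
    { 1u << 25, "UnrecFIS" },    /* assumed spelling */
    { 1u << 26, "DevExch" },     /* assumed spelling */
};

/* Write the space-separated names of all set diagnostic bits into buf
 * (truncating if buf is too small) and return how many bits were set. */
static int decode_serror(unsigned int serr, char *buf, size_t len)
{
    size_t i;
    int count = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof(serr_diag_bits) / sizeof(serr_diag_bits[0]); i++) {
        if (!(serr & serr_diag_bits[i].mask))
            continue;
        if (count > 0)
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, serr_diag_bits[i].name, len - strlen(buf) - 1);
        count++;
    }
    return count;
}
```

Decoding 0x1380000 this way yields the four names printed in the timeout case, and 0x180000 yields the two names printed in the ICRC case, so the timeout runs carry two additional link-level error bits (BadCRC and TrStaTrns) compared with the runs that recovered.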
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-25 13:00 ` Peter Favrholdt @ 2008-11-26 2:46 ` Tejun Heo 2008-11-26 8:12 ` Peter Favrholdt 0 siblings, 1 reply; 31+ messages in thread From: Tejun Heo @ 2008-11-26 2:46 UTC (permalink / raw) To: Peter Favrholdt; +Cc: linux-ide, Mikael Pettersson, Jeff Garzik Peter Favrholdt wrote: > Hi Tejun, > > After running dd several times finally my setup failed. Unfortunately it > didn't recover :-( Eh... crap. > Tejun Heo wrote: >> sata_promise hardreset doesn't seem to be able to acquire the initial >> D2H Reg FIS after hardreset leading to hardreset timeouts. Request >> follow-up SRST. >> >> http://article.gmane.org/gmane.linux.ide/36186 >> >> Signed-off-by: Tejun Heo <tj@kernel.org> >> --- >> Peter, can you please test this one too? It's essentially the same >> code just slightly prettier. Mikael, what do you think about this? Does unloading and reloading the driver make the device recognized again? Thanks. -- tejun ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST
  2008-11-26 2:46 ` Tejun Heo
@ 2008-11-26 8:12 ` Peter Favrholdt
  2008-11-26 23:07 ` Peter Favrholdt
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Favrholdt @ 2008-11-26 8:12 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Mikael Pettersson, Jeff Garzik

Tejun Heo wrote:
> Peter Favrholdt wrote:
>> Hi Tejun,
>>
>> After running dd several times finally my setup failed. Unfortunately it
>> didn't recover :-(
>
> Eh... crap.
>
>> Tejun Heo wrote:
>>> sata_promise hardreset doesn't seem to be able to acquire the initial
>>> D2H Reg FIS after hardreset leading to hardreset timeouts. Request
>>> follow-up SRST.
>>>
>>> http://article.gmane.org/gmane.linux.ide/36186
>>>
>>> Signed-off-by: Tejun Heo <tj@kernel.org>
>>> ---
>>> Peter, can you please test this one too? It's essentially the same
>>> code just slightly prettier. Mikael, what do you think about this?
>
> Does unloading and reloading the driver make the device recognized
> again?

Yes, I tried modprobe -r sata_promise and modprobe sata_promise, but it
didn't recognize the device. I forgot to post the dmesg output, sorry.

Best regards,

Peter

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-26 8:12 ` Peter Favrholdt @ 2008-11-26 23:07 ` Peter Favrholdt 0 siblings, 0 replies; 31+ messages in thread From: Peter Favrholdt @ 2008-11-26 23:07 UTC (permalink / raw) To: linux-ide; +Cc: Tejun Heo, Mikael Pettersson, Jeff Garzik Hi Tejun and list, Replying to my own mail, but now with the dmesg output which was missing earlier: Peter Favrholdt wrote: > Tejun Heo wrote: >> Peter Favrholdt wrote: >>> After running dd several times finally my setup failed. Unfortunately it >>> didn't recover :-( >> >> Eh... crap. >> >>> Tejun Heo wrote: >>>> sata_promise hardreset doesn't seem to be able to acquire the initial >>>> D2H Reg FIS after hardreset leading to hardreset timeouts. Request >>>> follow-up SRST. >>>> >>>> http://article.gmane.org/gmane.linux.ide/36186 >>>> >>>> Signed-off-by: Tejun Heo <tj@kernel.org> >>>> --- >>>> Peter, can you please test this one too? It's essentially the same >>>> code just slightly prettier. Mikael, what do you think about this? >> >> Does unloading and reloading the driver make the device recognized >> again? > > Yes I tried modprobe -r sata_promise and modprobe sata_promise: It > didn't recognize the device. I forgot to post the dmesg output, sorry. 
Well, I ran the tests again, and here are the errors from the new run: [114851.970088] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x6 frozen [114851.970098] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns } [114851.970107] ata1.00: cmd 25/00:00:00:3f:e6/00:02:01:00:00/e0 tag 0 dma 262144 in [114851.970108] res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [114851.970112] ata1.00: status: { DRDY } [114851.970164] ata1: hard resetting link [114857.520147] ata1: link is slow to respond, please be patient (ready=-19) [114862.030032] ata1: SRST failed (errno=-16) [114862.030121] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [114867.050164] ata1.00: qc timeout (cmd 0xec) [114867.050533] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [114867.050537] ata1.00: revalidation failed (errno=-5) [114867.050560] ata1: hard resetting link [114872.610029] ata1: link is slow to respond, please be patient (ready=-19) [114877.111605] ata1: SRST failed (errno=-16) [114877.111697] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [114887.110131] ata1.00: qc timeout (cmd 0xec) [114887.110457] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [114887.110461] ata1.00: revalidation failed (errno=-5) [114887.110490] ata1: hard resetting link [114892.660241] ata1: link is slow to respond, please be patient (ready=-19) [114897.160046] ata1: SRST failed (errno=-16) [114897.160148] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [114927.160160] ata1.00: qc timeout (cmd 0xec) [114927.160406] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5) [114927.160410] ata1.00: revalidation failed (errno=-5) [114927.160413] ata1.00: disabled [114927.160441] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t4 [114927.160444] ata1: hotplug_status 0x80 [114927.160487] ata1: hard resetting link [114933.110040] ata1: link is slow to respond, please be patient (ready=-19) [114937.190038] ata1: SRST failed (errno=-16) [114937.190132] ata1: SATA 
link up 3.0 Gbps (SStatus 123 SControl 300) [114937.190179] ata1: link online but device misclassified, retrying [114937.190227] ata1: hard resetting link [114943.140073] ata1: link is slow to respond, please be patient (ready=-19) [114947.220400] ata1: SRST failed (errno=-16) [114947.220508] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [114947.220542] ata1: link online but device misclassified, retrying [114947.220576] ata1: hard resetting link [114953.170035] ata1: link is slow to respond, please be patient (ready=-19) [114982.280039] ata1: SRST failed (errno=-16) [114982.280148] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [114982.280196] ata1: link online but device misclassified, retrying [114982.280205] ata1: limiting SATA link speed to 1.5 Gbps [114982.280255] ata1: hard resetting link [114987.330035] ata1: SRST failed (errno=-16) [114987.330145] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [114987.330178] ata1: link online but device misclassified, device detection might fail [114987.330438] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t3 [114987.330441] ata1: hotplug_status 0x80 [114987.330493] ata1: hard resetting link [114993.280035] ata1: link is slow to respond, please be patient (ready=-19) [114997.360597] ata1: SRST failed (errno=-16) [114997.360699] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [114997.360740] ata1: link online but device misclassified, retrying [114997.360781] ata1: hard resetting link [115003.330084] ata1: link is slow to respond, please be patient (ready=-19) [115007.410222] ata1: SRST failed (errno=-16) [115007.410327] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115007.410363] ata1: link online but device misclassified, retrying [115007.410399] ata1: hard resetting link [115013.360038] ata1: link is slow to respond, please be patient (ready=-19) [115042.460168] ata1: SRST failed (errno=-16) [115042.460271] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) 
[115042.460304] ata1: link online but device misclassified, retrying [115042.460313] ata1: limiting SATA link speed to 1.5 Gbps [115042.460346] ata1: hard resetting link [115047.510048] ata1: SRST failed (errno=-16) [115047.510143] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [115047.510176] ata1: link online but device misclassified, device detection might fail [115047.510524] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t2 [115047.510527] ata1: hotplug_status 0x80 [115047.510578] ata1: hard resetting link [115053.460097] ata1: link is slow to respond, please be patient (ready=-19) [115057.540039] ata1: SRST failed (errno=-16) [115057.540141] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115057.540175] ata1: link online but device misclassified, retrying [115057.540222] ata1: hard resetting link [115063.510044] ata1: link is slow to respond, please be patient (ready=-19) [115067.590036] ata1: SRST failed (errno=-16) [115067.590137] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115067.590178] ata1: link online but device misclassified, retrying [115067.590226] ata1: hard resetting link [115073.550062] ata1: link is slow to respond, please be patient (ready=-19) [115102.650031] ata1: SRST failed (errno=-16) [115102.650130] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115102.650264] ata1: link online but device misclassified, retrying [115102.650274] ata1: limiting SATA link speed to 1.5 Gbps [115102.650310] ata1: hard resetting link [115107.700035] ata1: SRST failed (errno=-16) [115107.700131] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [115107.700171] ata1: link online but device misclassified, device detection might fail [115107.700500] ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t1 [115107.700503] ata1: hotplug_status 0x80 [115107.700556] ata1: hard resetting link [115113.652245] ata1: link is slow to respond, please be patient (ready=-19) [115117.740038] ata1: SRST failed (errno=-16) 
[115117.740133] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115117.740173] ata1: link online but device misclassified, retrying [115117.740207] ata1: hard resetting link [115123.690028] ata1: link is slow to respond, please be patient (ready=-19) [115127.770479] ata1: SRST failed (errno=-16) [115127.770587] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115127.770626] ata1: link online but device misclassified, retrying [115127.770646] ata1: hard resetting link [115133.720038] ata1: link is slow to respond, please be patient (ready=-19) [115162.830077] ata1: SRST failed (errno=-16) [115162.830179] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115162.830212] ata1: link online but device misclassified, retrying [115162.830220] ata1: limiting SATA link speed to 1.5 Gbps [115162.830268] ata1: hard resetting link [115167.880253] ata1: SRST failed (errno=-16) [115167.880362] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [115167.880404] ata1: link online but device misclassified, device detection might fail [115167.880414] ata1: EH pending after 5 tries, giving up [115167.880436] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 [115167.880440] sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor] [115167.880445] Descriptor sense data with sense descriptors (in hex): [115167.880447] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 [115167.880454] 00 00 00 00 [115167.880457] sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0 [115167.880461] end_request: I/O error, dev sda, sector 31866624 [115167.880465] Buffer I/O error on device sda, logical block 3983328 [115167.880470] Buffer I/O error on device sda, logical block 3983329 [115167.880473] Buffer I/O error on device sda, logical block 3983330 [115167.880476] Buffer I/O error on device sda, logical block 3983331 [115167.880479] Buffer I/O error on device sda, logical block 3983332 [115167.880483] Buffer I/O error on device sda, logical block 3983333 [115167.880486] Buffer I/O error on 
device sda, logical block 3983334 [115167.880489] Buffer I/O error on device sda, logical block 3983335 [115167.880492] Buffer I/O error on device sda, logical block 3983336 [115167.880496] Buffer I/O error on device sda, logical block 3983337 [115167.880644] ata1: EH complete [115167.880653] ata1.00: detaching (SCSI 0:0:0:0) [115167.880945] sd 0:0:0:0: [sda] Synchronizing SCSI cache [115167.883403] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 [115167.883412] sd 0:0:0:0: [sda] Stopping disk [115167.883895] sd 0:0:0:0: [sda] START_STOP FAILED [115167.883897] sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 modprobe -r sata_promise adds the following to dmesg: [115788.005895] ata2.00: disabled [115788.006119] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [115788.006153] sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [115788.006157] sd 1:0:0:0: [sdb] Stopping disk [115788.006165] sd 1:0:0:0: [sdb] START_STOP FAILED [115788.006167] sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00 [115788.006438] ata3.00: disabled [115788.006588] sd 2:0:0:0: [sdc] Synchronizing SCSI cache [115788.006610] sd 2:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [115788.006614] sd 2:0:0:0: [sdc] Stopping disk [115788.006621] sd 2:0:0:0: [sdc] START_STOP FAILED [115788.006623] sd 2:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 [115788.006743] ata4.00: disabled [115788.006892] sd 3:0:0:0: [sdd] Synchronizing SCSI cache [115788.006914] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00 [115788.006918] sd 3:0:0:0: [sdd] Stopping disk [115788.006925] sd 3:0:0:0: [sdd] START_STOP FAILED [115788.006927] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00 [115788.007073] sata_promise 0000:01:08.0: PCI INT A disabled Then modprobe sata_promise adds to dmesg: [115974.655463] sata_promise 0000:01:08.0: version 2.12 [115974.655717] sata_promise 0000:01:08.0: PCI INT A -> Link[APC3] -> GSI 18 (level, high) -> IRQ 18 [115974.656113] scsi4 : sata_promise 
[115974.657795] scsi5 : sata_promise [115974.657947] scsi6 : sata_promise [115974.658400] scsi7 : sata_promise [115974.658466] ata5: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024380 irq 18 [115974.658473] ata6: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024280 irq 18 [115974.658477] ata7: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024200 irq 18 [115974.658482] ata8: SATA max UDMA/133 mmio m4096@0xe9024000 ata 0xe9024300 irq 18 [115980.200009] ata5: link is slow to respond, please be patient (ready=-19) [115984.700016] ata5: SRST failed (errno=-16) [115984.700040] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115984.700050] ata5: link online but device misclassified, retrying [115990.250010] ata5: link is slow to respond, please be patient (ready=-19) [115994.750011] ata5: SRST failed (errno=-16) [115994.750035] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [115994.750045] ata5: link online but device misclassified, retrying [116000.300013] ata5: link is slow to respond, please be patient (ready=-19) [116029.760011] ata5: SRST failed (errno=-16) [116029.760034] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116029.760043] ata5: link online but device misclassified, retrying [116029.760048] ata5: limiting SATA link speed to 1.5 Gbps [116034.770011] ata5: SRST failed (errno=-16) [116034.770035] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [116034.770045] ata5: link online but device misclassified, device detection might fail [116035.280132] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116040.280657] ata6.00: qc timeout (cmd 0xec) [116040.280871] ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4) [116045.830011] ata6: link is slow to respond, please be patient (ready=-19) [116050.330012] ata6: SRST failed (errno=-16) [116050.330034] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116050.330043] ata6: link online but device misclassified, retrying [116055.880744] ata6: link is slow to respond, 
please be patient (ready=-19) [116060.380009] ata6: SRST failed (errno=-16) [116060.380032] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116060.380042] ata6: link online but device misclassified, retrying [116065.930012] ata6: link is slow to respond, please be patient (ready=-19) [116095.390009] ata6: SRST failed (errno=-16) [116095.390032] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116095.390041] ata6: link online but device misclassified, retrying [116095.390046] ata6: limiting SATA link speed to 1.5 Gbps [116100.400009] ata6: SRST failed (errno=-16) [116100.400031] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [116100.400041] ata6: link online but device misclassified, device detection might fail [116100.910044] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116105.910028] ata7.00: qc timeout (cmd 0xec) [116105.910242] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4) [116111.460073] ata7: link is slow to respond, please be patient (ready=-19) [116115.960011] ata7: SRST failed (errno=-16) [116115.960033] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116115.960043] ata7: link online but device misclassified, retrying [116121.510016] ata7: link is slow to respond, please be patient (ready=-19) [116126.010016] ata7: SRST failed (errno=-16) [116126.010039] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116126.010048] ata7: link online but device misclassified, retrying [116131.560010] ata7: link is slow to respond, please be patient (ready=-19) [116161.020012] ata7: SRST failed (errno=-16) [116161.020035] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116161.020044] ata7: link online but device misclassified, retrying [116161.020049] ata7: limiting SATA link speed to 1.5 Gbps [116166.030012] ata7: SRST failed (errno=-16) [116166.030035] ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [116166.030044] ata7: link online but device misclassified, device detection might fail [116166.540047] 
ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116171.540029] ata8.00: qc timeout (cmd 0xec) [116171.540244] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4) [116177.090018] ata8: link is slow to respond, please be patient (ready=-19) [116181.590010] ata8: SRST failed (errno=-16) [116181.590033] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116181.590042] ata8: link online but device misclassified, retrying [116187.140011] ata8: link is slow to respond, please be patient (ready=-19) [116191.640010] ata8: SRST failed (errno=-16) [116191.640034] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116191.640043] ata8: link online but device misclassified, retrying [116197.190010] ata8: link is slow to respond, please be patient (ready=-19) [116226.650011] ata8: SRST failed (errno=-16) [116226.650034] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [116226.650044] ata8: link online but device misclassified, retrying [116226.650048] ata8: limiting SATA link speed to 1.5 Gbps [116231.660075] ata8: SRST failed (errno=-16) [116231.660098] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [116231.660107] ata8: link online but device misclassified, device detection might fail I tried modprobe -r and modprobing again but with the same result. Best regards, Peter ^ permalink raw reply [flat|nested] 31+ messages in thread
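For readers following the log above: the "SStatus 123 SControl 300" and "SStatus 113 SControl 310" pairs are hexadecimal register values, and the errno values are the usual Linux ones (-5 is EIO, -16 is EBUSY, -19 is ENODEV). The following is only a minimal userspace sketch of how those SCR values break down, assuming the standard SATA field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The names scr_fields, scr_decode, spd_name and scr_check_log_values are made up for illustration and are not part of libata.

```c
#include <assert.h>
#include <stdint.h>

/* Low three nibbles of SStatus/SControl (illustrative struct, not a libata type). */
struct scr_fields {
	unsigned int det; /* bits 3:0:  device detection */
	unsigned int spd; /* bits 7:4:  negotiated speed (SStatus) or speed limit (SControl) */
	unsigned int ipm; /* bits 11:8: interface power management */
};

/* Split a raw SCR value, as printed in hex by the log, into its fields. */
static struct scr_fields scr_decode(uint32_t scr)
{
	struct scr_fields f;

	f.det = scr & 0xF;
	f.spd = (scr >> 4) & 0xF;
	f.ipm = (scr >> 8) & 0xF;
	return f;
}

/* Human-readable name for the SPD field. */
static const char *spd_name(unsigned int spd)
{
	switch (spd) {
	case 0:  return "none / no limit";
	case 1:  return "1.5 Gbps";
	case 2:  return "3.0 Gbps";
	default: return "unknown";
	}
}

/* Sanity check against the values that appear in the log above. */
void scr_check_log_values(void)
{
	/* SStatus 123: device present with PHY up (DET=3), Gen2, interface active */
	assert(scr_decode(0x123).det == 3);
	assert(scr_decode(0x123).spd == 2);
	assert(scr_decode(0x123).ipm == 1);
	/* SStatus 113 / SControl 310: after "limiting SATA link speed to 1.5 Gbps" */
	assert(scr_decode(0x113).spd == 1);
	assert(scr_decode(0x310).spd == 1);
	/* SControl 300: no speed limit requested */
	assert(scr_decode(0x300).spd == 0);
}
```

Read this way, the log shows the PHY link itself coming up every time (DET=3, first at 3.0 Gbps and then at 1.5 Gbps), while the follow-up SRST and IDENTIFY keep failing. That fits the "link online but device misclassified" messages.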
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-21 4:56 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST Tejun Heo ` (2 preceding siblings ...) 2008-11-25 13:00 ` Peter Favrholdt @ 2008-11-25 17:27 ` Jeff Garzik 2008-11-25 21:17 ` Mikael Pettersson 2008-11-29 21:50 ` Mikael Pettersson 3 siblings, 2 replies; 31+ messages in thread From: Jeff Garzik @ 2008-11-25 17:27 UTC (permalink / raw) To: Tejun Heo, Mikael Pettersson; +Cc: Peter Favrholdt, linux-ide Tejun Heo wrote: > sata_promise hardreset doesn't seem to be able to acquire the initial > D2H Reg FIS after hardreset leading to hardreset timeouts. Request > follow-up SRST. > > http://article.gmane.org/gmane.linux.ide/36186 > > Signed-off-by: Tejun Heo <tj@kernel.org> > --- > Peter, can you please test this one too? It's essentially the same > code just slightly prettier. Mikael, what do you think about this? > > Jeff, please commit only after Peter's verification and Mikael's ACK. > > Thanks. > > drivers/ata/sata_promise.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c > index ba9a257..917038a 100644 > --- a/drivers/ata/sata_promise.c > +++ b/drivers/ata/sata_promise.c > @@ -710,7 +710,12 @@ static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, > unsigned long deadline) > { > pdc_reset_port(link->ap); > - return sata_sff_hardreset(link, class, deadline); > + > + /* sata_promise can't reliably acquire the first D2H Reg FIS > + * after hardreset. Do non-waiting hardreset and request > + * follow-up SRST. > + */ > + return sata_std_hardreset(link, class, deadline); hrm.... at this point we have deviated massively from the standard Promise method of hard reset... 
* set PMP port * mask hotplug irqs * reset port# in PDC_GLOBAL_CTL * pdc_reset_port() * reset FPDMA -- probably a good idea even if not doing FPDMA * clear SATA Error register (0xffffffff) * clear errors-reported-from-link-layer register * wait standard length of time for SATA connection * clear hotplug bits * set PMP port The PDC_GLOBAL_CTL bitbang is the most notable missing element in the hard reset path, though we also miss clearing an apparently-important error register as well. FPDMA reset would be a good idea I think, even if not in use, mainly for paranoia's sake and because that's what Promise's driver does. Jeff ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-25 17:27 ` Jeff Garzik @ 2008-11-25 21:17 ` Mikael Pettersson 2008-11-29 21:50 ` Mikael Pettersson 1 sibling, 0 replies; 31+ messages in thread From: Mikael Pettersson @ 2008-11-25 21:17 UTC (permalink / raw) To: Jeff Garzik; +Cc: Tejun Heo, Mikael Pettersson, Peter Favrholdt, linux-ide Jeff Garzik writes: > hrm.... at this point we have deviated massively from the standard > Promise method of hard reset... > > * set PMP port > * mask hotplug irqs > * reset port# in PDC_GLOBAL_CTL > * pdc_reset_port() > * reset FPDMA -- probably a good idea even if not doing FPDMA > * clear SATA Error register (0xffffffff) > * clear errors-reported-from-link-layer register > * wait standard length of time for SATA connection > * clear hotplug bits > * set PMP port > > The PDC_GLOBAL_CTL bitbang is the most notable missing element in the > hard reset path, though we also miss clearing an apparently-important > error register as well. FPDMA reset would be a good idea I think, even > if not in use, mainly for paranoia's sake and because that's what > Promise's driver does. ->freeze() masks hotplug irqs, though it seems the events are still logged and wrongly interpreted by the irq handler. The other Promise bits should be easy to do, though FPDMA reset will need special-casing for 1st-vs-2nd generation and maybe also sata-vs-pata. I'll look into it. /Mikael ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST 2008-11-25 17:27 ` Jeff Garzik 2008-11-25 21:17 ` Mikael Pettersson @ 2008-11-29 21:50 ` Mikael Pettersson 2008-11-30 15:06 ` Peter Favrholdt 2009-02-10 4:30 ` Jeff Garzik 1 sibling, 2 replies; 31+ messages in thread From: Mikael Pettersson @ 2008-11-29 21:50 UTC (permalink / raw) To: Jeff Garzik; +Cc: Tejun Heo, Mikael Pettersson, Peter Favrholdt, linux-ide Jeff Garzik writes: > hrm.... at this point we have deviated massively from the standard > Promise method of hard reset... > > * set PMP port > * make hotplug irqs > * reset port# in PDC_GLOBAL_CTL > * pdc_reset_port() > * reset FPDMA -- probably a good idea even if not doing FPDMA > * clear SATA Error register (0xffffffff) > * clear errors-reported-from-link-layer register > * wait standard length of time for SATA connection > * clear hotplug bits > * set PMP port > > The PDC_GLOBAL_CTL bitbang is the most notable missing element in the I assume you meant the PCI control register (offset 0x48). > hard reset path, though we also miss clearing an apparently-important > error register as well. FPDMA reset would be a good idea I think, even > if not in use, mainly for paranoia's sake and because that's what > Promise's driver does. Here's a patch on top of 2.6.28-rc6 that should make sata_promise's reset sequences a closer match to what Promise's drivers do. - soft reset (pdc_reset_port): * wait for ATA engine to not be in packet command mode (2nd gen only) * write reset bit in PDC_CTLSTAT before the first read in the loop * for 2nd gen SATA follow up with FPDMA reset and clearing error status registers - hard reset (pdc_sata_hardreset): * wait for ATA engine to not be in packet command mode (2nd gen only) * reset ATA engine via the PCI control register * Tejun's change to use non-waiting hardreset + follow-up SRST I'm not changing the hotplug mask bits since they are taken care of by sata_promise's ->freeze() and ->thaw() operations. 
And I'm not writing the PMP port # because that's always zero (for now). Tested here on a SATAII150 TX2plus w/ two sata disks, some hdparm -y + smartctl tests which used to cause timeouts on one of those disks, and some parallel dd:s to stress things a little. No issues found. I'll test it on a 1st gen 20378 next week. Testers welcome. /Mikael --- linux-2.6.28-rc6/drivers/ata/sata_promise.c.~1~ 2008-11-29 16:05:04.000000000 +0100 +++ linux-2.6.28-rc6/drivers/ata/sata_promise.c 2008-11-29 20:26:26.000000000 +0100 @@ -56,6 +56,7 @@ enum { /* host register offsets (from host->iomap[PDC_MMIO_BAR]) */ PDC_INT_SEQMASK = 0x40, /* Mask of asserted SEQ INTs */ PDC_FLASH_CTL = 0x44, /* Flash control register */ + PDC_PCI_CTL = 0x48, /* PCI control/status reg */ PDC_SATA_PLUG_CSR = 0x6C, /* SATA Plug control/status reg */ PDC2_SATA_PLUG_CSR = 0x60, /* SATAII Plug control/status reg */ PDC_TBG_MODE = 0x41C, /* TBG mode (not SATAII) */ @@ -75,7 +76,17 @@ enum { PDC_CTLSTAT = 0x60, /* IDE control and status (per port) */ /* per-port SATA register offsets (from ap->ioaddr.scr_addr) */ + PDC_SATA_ERROR = 0x04, PDC_PHYMODE4 = 0x14, + PDC_LINK_LAYER_ERRORS = 0x6C, + PDC_FPDMA_CTLSTAT = 0xD8, + PDC_INTERNAL_DEBUG_1 = 0xF8, /* also used for PATA */ + PDC_INTERNAL_DEBUG_2 = 0xFC, /* also used for PATA */ + + /* PDC_FPDMA_CTLSTAT bit definitions */ + PDC_FPDMA_CTLSTAT_RESET = 1 << 3, + PDC_FPDMA_CTLSTAT_DMASETUP_INT_FLAG = 1 << 10, + PDC_FPDMA_CTLSTAT_SETDB_INT_FLAG = 1 << 11, /* PDC_GLOBAL_CTL bit definitions */ PDC_PH_ERR = (1 << 8), /* PCI error while loading packet */ @@ -354,12 +365,76 @@ static int pdc_sata_port_start(struct at return 0; } +static void pdc_fpdma_clear_interrupt_flag(struct ata_port *ap) +{ + void __iomem *sata_mmio = ap->ioaddr.scr_addr; + u32 tmp; + + tmp = readl(sata_mmio + PDC_FPDMA_CTLSTAT); + tmp |= PDC_FPDMA_CTLSTAT_DMASETUP_INT_FLAG; + tmp |= PDC_FPDMA_CTLSTAT_SETDB_INT_FLAG; + + /* It's not allowed to write to the entire FPDMA_CTLSTAT register + when NCQ 
is running. So do a byte-sized write to bits 10 and 11. */ + writeb(tmp >> 8, sata_mmio + PDC_FPDMA_CTLSTAT + 1); + readb(sata_mmio + PDC_FPDMA_CTLSTAT + 1); /* flush */ +} + +static void pdc_fpdma_reset(struct ata_port *ap) +{ + void __iomem *sata_mmio = ap->ioaddr.scr_addr; + u8 tmp; + + tmp = (u8)readl(sata_mmio + PDC_FPDMA_CTLSTAT); + tmp &= 0x7F; + tmp |= PDC_FPDMA_CTLSTAT_RESET; + writeb(tmp, sata_mmio + PDC_FPDMA_CTLSTAT); + readl(sata_mmio + PDC_FPDMA_CTLSTAT); /* flush */ + udelay(100); + tmp &= ~PDC_FPDMA_CTLSTAT_RESET; + writeb(tmp, sata_mmio + PDC_FPDMA_CTLSTAT); + readl(sata_mmio + PDC_FPDMA_CTLSTAT); /* flush */ + + pdc_fpdma_clear_interrupt_flag(ap); +} + +static void pdc_not_at_command_packet_phase(struct ata_port *ap) +{ + void __iomem *sata_mmio = ap->ioaddr.scr_addr; + unsigned int i; + u32 tmp; + + /* check not at ASIC packet command phase */ + for (i = 0; i < 100; ++i) { + writel(0, sata_mmio + PDC_INTERNAL_DEBUG_1); + tmp = readl(sata_mmio + PDC_INTERNAL_DEBUG_2); + if ((tmp & 0xF) != 1) + break; + udelay(100); + } +} + +static void pdc_clear_internal_debug_record_error_register(struct ata_port *ap) +{ + void __iomem *sata_mmio = ap->ioaddr.scr_addr; + + writel(0xffffffff, sata_mmio + PDC_SATA_ERROR); + writel(0xffff0000, sata_mmio + PDC_LINK_LAYER_ERRORS); +} + static void pdc_reset_port(struct ata_port *ap) { void __iomem *ata_ctlstat_mmio = ap->ioaddr.cmd_addr + PDC_CTLSTAT; unsigned int i; u32 tmp; + if (ap->flags & PDC_FLAG_GEN_II) + pdc_not_at_command_packet_phase(ap); + + tmp = readl(ata_ctlstat_mmio); + tmp |= PDC_RESET; + writel(tmp, ata_ctlstat_mmio); + for (i = 11; i > 0; i--) { tmp = readl(ata_ctlstat_mmio); if (tmp & PDC_RESET) @@ -374,6 +449,11 @@ static void pdc_reset_port(struct ata_po tmp &= ~PDC_RESET; writel(tmp, ata_ctlstat_mmio); readl(ata_ctlstat_mmio); /* flush */ + + if (sata_scr_valid(&ap->link) && (ap->flags & PDC_FLAG_GEN_II)) { + pdc_fpdma_reset(ap); + pdc_clear_internal_debug_record_error_register(ap); + } } static 
int pdc_pata_cable_detect(struct ata_port *ap) @@ -706,11 +786,50 @@ static int pdc_pata_softreset(struct ata return ata_sff_softreset(link, class, deadline); } +static unsigned int pdc_ata_port_to_ata_no(const struct ata_port *ap) +{ + void __iomem *ata_mmio = ap->ioaddr.cmd_addr; + void __iomem *host_mmio = ap->host->iomap[PDC_MMIO_BAR]; + + /* ata_mmio == host_mmio + 0x200 + ata_no * 0x80 */ + return (ata_mmio - host_mmio - 0x200) / 0x80; +} + +static void pdc_hard_reset_port(struct ata_port *ap) +{ + void __iomem *host_mmio = ap->host->iomap[PDC_MMIO_BAR]; + void __iomem *pcictl_b1_mmio = host_mmio + PDC_PCI_CTL + 1; + unsigned int ata_no = pdc_ata_port_to_ata_no(ap); + u8 tmp; + + spin_lock(&ap->host->lock); + + tmp = readb(pcictl_b1_mmio); + tmp &= ~(0x10 << ata_no); + writeb(tmp, pcictl_b1_mmio); + readb(pcictl_b1_mmio); /* flush */ + udelay(100); + tmp |= (0x10 << ata_no); + writeb(tmp, pcictl_b1_mmio); + readb(pcictl_b1_mmio); /* flush */ + + spin_unlock(&ap->host->lock); +} + static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class, unsigned long deadline) { + if (link->ap->flags & PDC_FLAG_GEN_II) + pdc_not_at_command_packet_phase(link->ap); + /* hotplug IRQs should have been masked by pdc_sata_freeze() */ + pdc_hard_reset_port(link->ap); pdc_reset_port(link->ap); - return sata_sff_hardreset(link, class, deadline); + + /* sata_promise can't reliably acquire the first D2H Reg FIS + * after hardreset. Do non-waiting hardreset and request + * follow-up SRST. + */ + return sata_std_hardreset(link, class, deadline); } static void pdc_error_handler(struct ata_port *ap) ^ permalink raw reply [flat|nested] 31+ messages in thread
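A note on the arithmetic in the patch above, as a rough userspace sketch rather than kernel code. It mirrors the port-number calculation from pdc_ata_port_to_ata_no() (taking the 0x200 base and 0x80 stride from the comment in the patch) and shows which bits the byte-sized accesses at "register + 1" touch, assuming the usual little-endian layout of a 32-bit PCI MMIO register. The helper names port_offset_to_ata_no, byte1_of, reset_bit_in_byte1, reset_bit_in_reg32 and pulse_reset are hypothetical. The register is simulated with a plain variable, so nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_BASE   0x200u /* offset of the first port's ATA registers (from the patch comment) */
#define PORT_STRIDE 0x80u  /* spacing between per-port register blocks */

/* Same arithmetic as pdc_ata_port_to_ata_no(), on plain offsets instead of __iomem pointers. */
static unsigned int port_offset_to_ata_no(uint32_t ata_offset)
{
	assert(ata_offset >= PORT_BASE);
	assert((ata_offset - PORT_BASE) % PORT_STRIDE == 0);
	return (ata_offset - PORT_BASE) / PORT_STRIDE;
}

/* Bits 15:8 of a 32-bit mask, i.e. what a byte access at offset +1 sees. */
static uint8_t byte1_of(uint32_t mask32)
{
	return (uint8_t)(mask32 >> 8);
}

/* Per-port bit that the patch toggles in byte 1 of the PCI control register. */
static uint8_t reset_bit_in_byte1(unsigned int ata_no)
{
	assert(ata_no < 4);
	return (uint8_t)(0x10u << ata_no);
}

/* The same bit expressed as a mask on the full 32-bit register. */
static uint32_t reset_bit_in_reg32(unsigned int ata_no)
{
	return (uint32_t)reset_bit_in_byte1(ata_no) << 8;
}

/*
 * Simulated version of the clear-then-set sequence in pdc_hard_reset_port().
 * Stores the intermediate value in *during (if non-NULL) and returns the
 * final value of the byte.
 */
static uint8_t pulse_reset(uint8_t byte1, unsigned int ata_no, uint8_t *during)
{
	uint8_t bit = reset_bit_in_byte1(ata_no);
	uint8_t low = byte1 & (uint8_t)~bit;

	if (during)
		*during = low;
	return low | bit;
}
```

With the addresses from the dmesg output earlier in the thread (mmio base 0xe9024000), the port shown as ata5 at 0xe9024380 would map to port_offset_to_ata_no(0x380) == 3 and ata7 at 0xe9024200 to 0. So the hardware port number is not simply the libata port index, which is presumably why the patch derives it from the register address. Likewise byte1_of((1u << 10) | (1u << 11)) is 0x0C, which matches the "byte-sized write to bits 10 and 11" done through PDC_FPDMA_CTLSTAT + 1.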
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST
  2008-11-29 21:50 ` Mikael Pettersson
@ 2008-11-30 15:06 ` Peter Favrholdt
  2009-02-10  4:30 ` Jeff Garzik
  1 sibling, 0 replies; 31+ messages in thread
From: Peter Favrholdt @ 2008-11-30 15:06 UTC
To: linux-ide; +Cc: Mikael Pettersson, Jeff Garzik, Tejun Heo

Hi Mikael and list,

Mikael Pettersson wrote:
> Here's a patch on top of 2.6.28-rc6 that should make sata_promise's
> reset sequences a closer match to what Promise's drivers do.
> ...snip...
> Testers welcome.

I don't have physical access to my system at the moment, and some
channels/drives had already failed during previous testing. Without a
proper power-cycle the system did not recover the failed channels when
rebooted into 2.6.28-rc6 with your patch :-(

Best regards,

Peter

> --- linux-2.6.28-rc6/drivers/ata/sata_promise.c.~1~	2008-11-29 16:05:04.000000000 +0100
> +++ linux-2.6.28-rc6/drivers/ata/sata_promise.c	2008-11-29 20:26:26.000000000 +0100
> @@ -56,6 +56,7 @@ enum {
>  	/* host register offsets (from host->iomap[PDC_MMIO_BAR]) */
>  	PDC_INT_SEQMASK = 0x40, /* Mask of asserted SEQ INTs */
>  	PDC_FLASH_CTL = 0x44, /* Flash control register */
> +	PDC_PCI_CTL = 0x48, /* PCI control/status reg */
>  	PDC_SATA_PLUG_CSR = 0x6C, /* SATA Plug control/status reg */
>  	PDC2_SATA_PLUG_CSR = 0x60, /* SATAII Plug control/status reg */
>  	PDC_TBG_MODE = 0x41C, /* TBG mode (not SATAII) */
> @@ -75,7 +76,17 @@ enum {
>  	PDC_CTLSTAT = 0x60, /* IDE control and status (per port) */
>
>  	/* per-port SATA register offsets (from ap->ioaddr.scr_addr) */
> +	PDC_SATA_ERROR = 0x04,
>  	PDC_PHYMODE4 = 0x14,
> +	PDC_LINK_LAYER_ERRORS = 0x6C,
> +	PDC_FPDMA_CTLSTAT = 0xD8,
> +	PDC_INTERNAL_DEBUG_1 = 0xF8, /* also used for PATA */
> +	PDC_INTERNAL_DEBUG_2 = 0xFC, /* also used for PATA */
> +
> +	/* PDC_FPDMA_CTLSTAT bit definitions */
> +	PDC_FPDMA_CTLSTAT_RESET = 1 << 3,
> +	PDC_FPDMA_CTLSTAT_DMASETUP_INT_FLAG = 1 << 10,
> +	PDC_FPDMA_CTLSTAT_SETDB_INT_FLAG = 1 << 11,
>
>  	/* PDC_GLOBAL_CTL bit definitions */
>  	PDC_PH_ERR = (1 << 8), /* PCI error while loading packet */
> @@ -354,12 +365,76 @@ static int pdc_sata_port_start(struct at
>  	return 0;
>  }
>
> +static void pdc_fpdma_clear_interrupt_flag(struct ata_port *ap)
> +{
> +	void __iomem *sata_mmio = ap->ioaddr.scr_addr;
> +	u32 tmp;
> +
> +	tmp = readl(sata_mmio + PDC_FPDMA_CTLSTAT);
> +	tmp |= PDC_FPDMA_CTLSTAT_DMASETUP_INT_FLAG;
> +	tmp |= PDC_FPDMA_CTLSTAT_SETDB_INT_FLAG;
> +
> +	/* It's not allowed to write to the entire FPDMA_CTLSTAT register
> +	   when NCQ is running. So do a byte-sized write to bits 10 and 11. */
> +	writeb(tmp >> 8, sata_mmio + PDC_FPDMA_CTLSTAT + 1);
> +	readb(sata_mmio + PDC_FPDMA_CTLSTAT + 1); /* flush */
> +}
> +
> +static void pdc_fpdma_reset(struct ata_port *ap)
> +{
> +	void __iomem *sata_mmio = ap->ioaddr.scr_addr;
> +	u8 tmp;
> +
> +	tmp = (u8)readl(sata_mmio + PDC_FPDMA_CTLSTAT);
> +	tmp &= 0x7F;
> +	tmp |= PDC_FPDMA_CTLSTAT_RESET;
> +	writeb(tmp, sata_mmio + PDC_FPDMA_CTLSTAT);
> +	readl(sata_mmio + PDC_FPDMA_CTLSTAT); /* flush */
> +	udelay(100);
> +	tmp &= ~PDC_FPDMA_CTLSTAT_RESET;
> +	writeb(tmp, sata_mmio + PDC_FPDMA_CTLSTAT);
> +	readl(sata_mmio + PDC_FPDMA_CTLSTAT); /* flush */
> +
> +	pdc_fpdma_clear_interrupt_flag(ap);
> +}
> +
> +static void pdc_not_at_command_packet_phase(struct ata_port *ap)
> +{
> +	void __iomem *sata_mmio = ap->ioaddr.scr_addr;
> +	unsigned int i;
> +	u32 tmp;
> +
> +	/* check not at ASIC packet command phase */
> +	for (i = 0; i < 100; ++i) {
> +		writel(0, sata_mmio + PDC_INTERNAL_DEBUG_1);
> +		tmp = readl(sata_mmio + PDC_INTERNAL_DEBUG_2);
> +		if ((tmp & 0xF) != 1)
> +			break;
> +		udelay(100);
> +	}
> +}
> +
> +static void pdc_clear_internal_debug_record_error_register(struct ata_port *ap)
> +{
> +	void __iomem *sata_mmio = ap->ioaddr.scr_addr;
> +
> +	writel(0xffffffff, sata_mmio + PDC_SATA_ERROR);
> +	writel(0xffff0000, sata_mmio + PDC_LINK_LAYER_ERRORS);
> +}
> +
>  static void pdc_reset_port(struct ata_port *ap)
>  {
>  	void __iomem *ata_ctlstat_mmio = ap->ioaddr.cmd_addr + PDC_CTLSTAT;
>  	unsigned int i;
>  	u32 tmp;
>
> +	if (ap->flags & PDC_FLAG_GEN_II)
> +		pdc_not_at_command_packet_phase(ap);
> +
> +	tmp = readl(ata_ctlstat_mmio);
> +	tmp |= PDC_RESET;
> +	writel(tmp, ata_ctlstat_mmio);
> +
>  	for (i = 11; i > 0; i--) {
>  		tmp = readl(ata_ctlstat_mmio);
>  		if (tmp & PDC_RESET)
> @@ -374,6 +449,11 @@ static void pdc_reset_port(struct ata_po
>  	tmp &= ~PDC_RESET;
>  	writel(tmp, ata_ctlstat_mmio);
>  	readl(ata_ctlstat_mmio); /* flush */
> +
> +	if (sata_scr_valid(&ap->link) && (ap->flags & PDC_FLAG_GEN_II)) {
> +		pdc_fpdma_reset(ap);
> +		pdc_clear_internal_debug_record_error_register(ap);
> +	}
>  }
>
>  static int pdc_pata_cable_detect(struct ata_port *ap)
> @@ -706,11 +786,50 @@ static int pdc_pata_softreset(struct ata
>  	return ata_sff_softreset(link, class, deadline);
>  }
>
> +static unsigned int pdc_ata_port_to_ata_no(const struct ata_port *ap)
> +{
> +	void __iomem *ata_mmio = ap->ioaddr.cmd_addr;
> +	void __iomem *host_mmio = ap->host->iomap[PDC_MMIO_BAR];
> +
> +	/* ata_mmio == host_mmio + 0x200 + ata_no * 0x80 */
> +	return (ata_mmio - host_mmio - 0x200) / 0x80;
> +}
> +
> +static void pdc_hard_reset_port(struct ata_port *ap)
> +{
> +	void __iomem *host_mmio = ap->host->iomap[PDC_MMIO_BAR];
> +	void __iomem *pcictl_b1_mmio = host_mmio + PDC_PCI_CTL + 1;
> +	unsigned int ata_no = pdc_ata_port_to_ata_no(ap);
> +	u8 tmp;
> +
> +	spin_lock(&ap->host->lock);
> +
> +	tmp = readb(pcictl_b1_mmio);
> +	tmp &= ~(0x10 << ata_no);
> +	writeb(tmp, pcictl_b1_mmio);
> +	readb(pcictl_b1_mmio); /* flush */
> +	udelay(100);
> +	tmp |= (0x10 << ata_no);
> +	writeb(tmp, pcictl_b1_mmio);
> +	readb(pcictl_b1_mmio); /* flush */
> +
> +	spin_unlock(&ap->host->lock);
> +}
> +
>  static int pdc_sata_hardreset(struct ata_link *link, unsigned int *class,
>  			      unsigned long deadline)
>  {
> +	if (link->ap->flags & PDC_FLAG_GEN_II)
> +		pdc_not_at_command_packet_phase(link->ap);
> +	/* hotplug IRQs should have been masked by pdc_sata_freeze() */
> +	pdc_hard_reset_port(link->ap);
>  	pdc_reset_port(link->ap);
> -	return sata_sff_hardreset(link, class, deadline);
> +
> +	/* sata_promise can't reliably acquire the first D2H Reg FIS
> +	 * after hardreset. Do non-waiting hardreset and request
> +	 * follow-up SRST.
> +	 */
> +	return sata_std_hardreset(link, class, deadline);
>  }
>
>  static void pdc_error_handler(struct ata_port *ap)
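For readers following the register arithmetic in the quoted patch, here is a small user-space sketch of three calculations it relies on: recovering the ATA engine number from a per-port offset, forming the per-engine bit in byte 1 of the PCI control register, and deriving the byte that a byte-sized write to bits 10 and 11 of FPDMA_CTLSTAT would carry. The constants 0x200, 0x80, 0x10 and the bit positions come from the patch; the function and macro names are hypothetical, nothing here touches hardware, and it assumes the usual little-endian register layout, where offset + 1 addresses bits 8 to 15.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the quoted patch; the names are illustrative only. */
#define ATA_BLOCK_BASE   0x200u /* offset of the first per-port ATA block */
#define ATA_BLOCK_STRIDE 0x80u  /* size of one per-port ATA block */

/* Mirrors: ata_mmio == host_mmio + 0x200 + ata_no * 0x80 */
static unsigned int ata_no_from_offset(uint32_t port_offset)
{
	assert(port_offset >= ATA_BLOCK_BASE);
	return (port_offset - ATA_BLOCK_BASE) / ATA_BLOCK_STRIDE;
}

/* Per-engine bit that the patch clears and then sets in byte 1 of PCI_CTL. */
static uint8_t pcictl_b1_bit(unsigned int ata_no)
{
	assert(ata_no < 4); /* 0x10 << 3 is the last bit that fits in a byte */
	return (uint8_t)(0x10u << ata_no);
}

/*
 * Byte written at FPDMA_CTLSTAT + 1: bits 8..15 of the 32-bit value
 * with bits 10 and 11 forced on. Bits outside that byte are never
 * written, which is the point of the byte-sized access.
 */
static uint8_t fpdma_flag_byte(uint32_t ctlstat)
{
	ctlstat |= (1u << 10) | (1u << 11);
	return (uint8_t)(ctlstat >> 8);
}
```

For example, offset 0x380 maps to engine 3, whose bit in byte 1 is 0x80, and a register value of 0 yields the flag byte 0x0C.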
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST
  2008-11-29 21:50 ` Mikael Pettersson
  2008-11-30 15:06 ` Peter Favrholdt
@ 2009-02-10  4:30 ` Jeff Garzik
  2009-02-10 17:28 ` Mikael Pettersson
  1 sibling, 1 reply; 31+ messages in thread
From: Jeff Garzik @ 2009-02-10 4:30 UTC
To: Mikael Pettersson; +Cc: Tejun Heo, Peter Favrholdt, linux-ide

Mikael Pettersson wrote:
> Jeff Garzik writes:
> > hrm.... at this point we have deviated massively from the standard
> > Promise method of hard reset...
> >
> > * set PMP port
> > * make hotplug irqs
> > * reset port# in PDC_GLOBAL_CTL
> > * pdc_reset_port()
> > * reset FPDMA -- probably a good idea even if not doing FPDMA
> > * clear SATA Error register (0xffffffff)
> > * clear errors-reported-from-link-layer register
> > * wait standard length of time for SATA connection
> > * clear hotplug bits
> > * set PMP port
> >
> > The PDC_GLOBAL_CTL bitbang is the most notable missing element in the
>
> I assume you meant the PCI control register (offset 0x48).
>
> > hard reset path, though we also miss clearing an apparently-important
> > error register as well. FPDMA reset would be a good idea I think, even
> > if not in use, mainly for paranoia's sake and because that's what
> > Promise's driver does.
>
> Here's a patch on top of 2.6.28-rc6 that should make sata_promise's
> reset sequences a closer match to what Promise's drivers do.
>
> - soft reset (pdc_reset_port):
>   * wait for ATA engine to not be in packet command mode (2nd gen only)
>   * write reset bit in PDC_CTLSTAT before the first read in the loop
>   * for 2nd gen SATA follow up with FPDMA reset and clearing error status registers
> - hard reset (pdc_sata_hardreset):
>   * wait for ATA engine to not be in packet command mode (2nd gen only)
>   * reset ATA engine via the PCI control register
>   * Tejun's change to use non-waiting hardreset + follow-up SRST
>
> I'm not changing the hotplug mask bits since they are taken care
> of by sata_promise's ->freeze() and ->thaw() operations. And I'm
> not writing the PMP port # because that's always zero (for now).
>
> Tested here on a SATAII150 TX2plus w/ two sata disks, some hdparm -y
> + smartctl tests which used to cause timeouts on one of those disks,
> and some parallel dd:s to stress things a little. No issues found.
> I'll test it on a 1st gen 20378 next week.
>
> Testers welcome.

Any updates?
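The "wait for ATA engine to not be in packet command mode" step in the list above is a bounded poll: read a status value, stop as soon as its low nibble is no longer 1, and give up after a fixed number of tries. A minimal user-space sketch of that pattern follows, assuming a hypothetical read callback in place of readl() and a fake engine in place of the hardware; the patch's write to PDC_INTERNAL_DEBUG_1 and its udelay(100) between reads are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a register read; not a kernel API. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (value & 0xF) != 1 or until max_tries reads have been
 * made. Returns the number of reads performed. As in the quoted
 * patch, running out of tries is not reported as an error.
 */
static unsigned int wait_not_packet_phase(read_reg_fn read_reg, void *ctx,
					  unsigned int max_tries)
{
	unsigned int i;

	for (i = 0; i < max_tries; ++i) {
		if ((read_reg(ctx) & 0xF) != 1)
			return i + 1;
		/* the driver sleeps 100 microseconds here */
	}
	return max_tries;
}

/* Fake engine: reports phase 1 for busy_reads reads, then phase 0. */
struct fake_engine {
	unsigned int busy_reads;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_engine *e = ctx;

	assert(e != 0);
	if (e->busy_reads > 0) {
		e->busy_reads--;
		return 0x1;
	}
	return 0x0;
}
```

With a fake engine that stays busy for three reads, the poll returns after the fourth read; with one that never leaves phase 1, it stops at the cap, which matches the patch's choice to proceed with the reset either way.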
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST
  2009-02-10  4:30 ` Jeff Garzik
@ 2009-02-10 17:28 ` Mikael Pettersson
  2009-02-10 21:13 ` Jeff Garzik
  0 siblings, 1 reply; 31+ messages in thread
From: Mikael Pettersson @ 2009-02-10 17:28 UTC
To: Jeff Garzik; +Cc: Mikael Pettersson, Tejun Heo, Peter Favrholdt, linux-ide

Jeff Garzik writes:
> Mikael Pettersson wrote:
> > Jeff Garzik writes:
> > > hrm.... at this point we have deviated massively from the standard
> > > Promise method of hard reset...
> > >
> > > * set PMP port
> > > * make hotplug irqs
> > > * reset port# in PDC_GLOBAL_CTL
> > > * pdc_reset_port()
> > > * reset FPDMA -- probably a good idea even if not doing FPDMA
> > > * clear SATA Error register (0xffffffff)
> > > * clear errors-reported-from-link-layer register
> > > * wait standard length of time for SATA connection
> > > * clear hotplug bits
> > > * set PMP port
> > >
> > > The PDC_GLOBAL_CTL bitbang is the most notable missing element in the
> >
> > I assume you meant the PCI control register (offset 0x48).
> >
> > > hard reset path, though we also miss clearing an apparently-important
> > > error register as well. FPDMA reset would be a good idea I think, even
> > > if not in use, mainly for paranoia's sake and because that's what
> > > Promise's driver does.
> >
> > Here's a patch on top of 2.6.28-rc6 that should make sata_promise's
> > reset sequences a closer match to what Promise's drivers do.
> >
> > - soft reset (pdc_reset_port):
> >   * wait for ATA engine to not be in packet command mode (2nd gen only)
> >   * write reset bit in PDC_CTLSTAT before the first read in the loop
> >   * for 2nd gen SATA follow up with FPDMA reset and clearing error status registers
> > - hard reset (pdc_sata_hardreset):
> >   * wait for ATA engine to not be in packet command mode (2nd gen only)
> >   * reset ATA engine via the PCI control register
> >   * Tejun's change to use non-waiting hardreset + follow-up SRST
> >
> > I'm not changing the hotplug mask bits since they are taken care
> > of by sata_promise's ->freeze() and ->thaw() operations. And I'm
> > not writing the PMP port # because that's always zero (for now).
> >
> > Tested here on a SATAII150 TX2plus w/ two sata disks, some hdparm -y
> > + smartctl tests which used to cause timeouts on one of those disks,
> > and some parallel dd:s to stress things a little. No issues found.
> > I'll test it on a 1st gen 20378 next week.
> >
> > Testers welcome.
>
> Any updates?

Not really. I've kept the patch up to date for newer kernels, but
there's been no testing by others that I know of. Do you want to
include it in a testing branch?
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST
  2009-02-10 17:28 ` Mikael Pettersson
@ 2009-02-10 21:13 ` Jeff Garzik
  2009-02-23 12:17 ` [PATCH #upstream-fixes] sata_promise: request follow-up SRST - it works Peter Favrholdt
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff Garzik @ 2009-02-10 21:13 UTC
To: Mikael Pettersson; +Cc: Tejun Heo, Peter Favrholdt, linux-ide

Mikael Pettersson wrote:
> Jeff Garzik writes:
> > Mikael Pettersson wrote:
> > > Jeff Garzik writes:
> > > > hrm.... at this point we have deviated massively from the standard
> > > > Promise method of hard reset...
> > > >
> > > > * set PMP port
> > > > * make hotplug irqs
> > > > * reset port# in PDC_GLOBAL_CTL
> > > > * pdc_reset_port()
> > > > * reset FPDMA -- probably a good idea even if not doing FPDMA
> > > > * clear SATA Error register (0xffffffff)
> > > > * clear errors-reported-from-link-layer register
> > > > * wait standard length of time for SATA connection
> > > > * clear hotplug bits
> > > > * set PMP port
> > > >
> > > > The PDC_GLOBAL_CTL bitbang is the most notable missing element in the
> > >
> > > I assume you meant the PCI control register (offset 0x48).
> > >
> > > > hard reset path, though we also miss clearing an apparently-important
> > > > error register as well. FPDMA reset would be a good idea I think, even
> > > > if not in use, mainly for paranoia's sake and because that's what
> > > > Promise's driver does.
> > >
> > > Here's a patch on top of 2.6.28-rc6 that should make sata_promise's
> > > reset sequences a closer match to what Promise's drivers do.
> > >
> > > - soft reset (pdc_reset_port):
> > >   * wait for ATA engine to not be in packet command mode (2nd gen only)
> > >   * write reset bit in PDC_CTLSTAT before the first read in the loop
> > >   * for 2nd gen SATA follow up with FPDMA reset and clearing error status registers
> > > - hard reset (pdc_sata_hardreset):
> > >   * wait for ATA engine to not be in packet command mode (2nd gen only)
> > >   * reset ATA engine via the PCI control register
> > >   * Tejun's change to use non-waiting hardreset + follow-up SRST
> > >
> > > I'm not changing the hotplug mask bits since they are taken care
> > > of by sata_promise's ->freeze() and ->thaw() operations. And I'm
> > > not writing the PMP port # because that's always zero (for now).
> > >
> > > Tested here on a SATAII150 TX2plus w/ two sata disks, some hdparm -y
> > > + smartctl tests which used to cause timeouts on one of those disks,
> > > and some parallel dd:s to stress things a little. No issues found.
> > > I'll test it on a 1st gen 20378 next week.
> > >
> > > Testers welcome.
> >
> > Any updates?
>
> Not really. I've kept the patch up to date for newer kernels, but
> there's been no testing by others that I know of. Do you want to
> include it in a testing branch?

I can at least make sure it is in #ALL and #NEXT, two meta-branches in
libata-dev.git that are forwarded to akpm's -mm tree and linux-next,
respectively, for wider public testing.

We can give it more-public testing and hopefully get a bit more feedback.

And you can always post to linux-kernel and linux-ide requesting
testers... that's something even non-programmers can do. Something like
"Request for SATA Promise testing" would surely get at least a few
responses...

	Jeff
* Re: [PATCH #upstream-fixes] sata_promise: request follow-up SRST - it works
  2009-02-10 21:13 ` Jeff Garzik
@ 2009-02-23 12:17 ` Peter Favrholdt
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Favrholdt @ 2009-02-23 12:17 UTC
To: linux-ide, Jeff Garzik, Mikael Pettersson

Jeff Garzik wrote:
> Mikael Pettersson wrote:
>> Not really. I've kept the patch up to date for newer kernels, but
>> there's been no testing by others that I know of. Do you want to
>> include it in a testing branch?
>
> I can at least make sure it is in #ALL and #NEXT, two meta-branches in
> libata-dev.git that are forwarded to akpm's -mm tree and linux-next,
> respectively, for wider public testing.
>
> We can give it more-public testing and hopefully get a bit more feedback.

I have tested this patch:

http://user.it.uu.se/~mikpe/linux/patches/2.6/OLD/2.6.28/patch-sata_promise-reset-updates-v1-2.6.28

On a vanilla 2.6.28.6 kernel. It works without any problems.

Unfortunately I had to replace a faulty power supply - which may have been
the cause for my system failing earlier (as reported in previous postings
to this list).

Hardware:
AMD Barton 2500 on Nforce2 motherboard with
Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02)
4 Seagate 500GB ES drives:
  Model Number: ST3500630NS
  Firmware Revision: 3.AEE
  (with 1.5/3.0Gbps jumper removed = 3.0Gbps)

Best regards

Peter