* Re: libata interface fatal error @ 2007-05-26 9:43 Florian Effenberger 2007-05-29 9:16 ` Tejun Heo 0 siblings, 1 reply; 41+ messages in thread From: Florian Effenberger @ 2007-05-26 9:43 UTC (permalink / raw) To: linux-ide; +Cc: htejun, jeff Hi, it seems that the speed is never lowered, I always see "SATA link up 3.0 Gbps (SStatus 123 SControl 300)". Can I manually lower the speed via a kernel parameter? Thanks Florian ^ permalink raw reply [flat|nested] 41+ messages in thread
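For readers following the thread, the "SStatus 123 SControl 300" pair in the link-up message can be decoded by hand. The following is a minimal sketch, assuming the values are printed in hexadecimal and follow the standard SATA SCR field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The struct and function names are illustrative only and are not libata identifiers.

```c
#include <stdint.h>

/* Illustrative container for the three 4-bit fields shared by the
 * SStatus and SControl registers (names are not from libata). */
struct scr_fields {
    unsigned det; /* bits 3:0  - device detection / initialization */
    unsigned spd; /* bits 7:4  - speed (negotiated in SStatus, limit in SControl) */
    unsigned ipm; /* bits 11:8 - interface power management */
};

/* Split a raw SStatus or SControl value into its fields. */
static struct scr_fields scr_decode(uint32_t raw)
{
    struct scr_fields f;

    f.det = raw & 0xf;
    f.spd = (raw >> 4) & 0xf;
    f.ipm = (raw >> 8) & 0xf;
    return f;
}

/* Human-readable name for the SStatus SPD field. */
static const char *sstatus_speed_name(unsigned spd)
{
    switch (spd) {
    case 1:
        return "1.5 Gbps";
    case 2:
        return "3.0 Gbps";
    default:
        return "unknown or no link";
    }
}
```

With these helpers, `scr_decode(0x123)` yields DET=3 (device present, phy communication established), SPD=2 (3.0 Gbps) and IPM=1 (active), while `scr_decode(0x300)` yields SPD=0, meaning SControl places no restriction on the negotiated speed.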
* Re: libata interface fatal error 2007-05-26 9:43 libata interface fatal error Florian Effenberger @ 2007-05-29 9:16 ` Tejun Heo 2007-05-29 14:16 ` Florian Effenberger 2007-06-06 21:23 ` Florian Effenberger 0 siblings, 2 replies; 41+ messages in thread From: Tejun Heo @ 2007-05-29 9:16 UTC (permalink / raw) To: Florian Effenberger; +Cc: linux-ide, jeff Florian Effenberger wrote: > Hi, > > it seems that the speed is never lowered, I always see "SATA link up 3.0 > Gbps (SStatus 123 SControl 300)". > > Can I manually lower the speed via a kernel parameter? Currently, there is no mechanism to do that but hard drives usually have dip switch to force 1.5Gbps. Please try that. If your harddrive doesn't have that, please lemme know. I'll prepare a simple patch. Thanks. -- tejun ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-05-29 9:16 ` Tejun Heo @ 2007-05-29 14:16 ` Florian Effenberger 2007-06-06 21:23 ` Florian Effenberger 1 sibling, 0 replies; 41+ messages in thread From: Florian Effenberger @ 2007-05-29 14:16 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-ide, jeff Hi Tejun, > Currently, there is no mechanism to do that but hard drives usually have > dip switch to force 1.5Gbps. Please try that. If your harddrive > doesn't have that, please lemme know. I'll prepare a simple patch. thanks a lot, I will try that out and tell you the results. :-) Florian ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-05-29 9:16 ` Tejun Heo 2007-05-29 14:16 ` Florian Effenberger @ 2007-06-06 21:23 ` Florian Effenberger 2007-06-07 9:50 ` Tejun Heo 1 sibling, 1 reply; 41+ messages in thread From: Florian Effenberger @ 2007-06-06 21:23 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-ide, jeff Hi, > Currently, there is no mechanism to do that but hard drives usually have > dip switch to force 1.5Gbps. Please try that. If your harddrive > doesn't have that, please lemme know. I'll prepare a simple patch. unfortunately, the disks have a jumper board, but the jumpers are missing... could you write a patch for me? Would be much appreciated! Thanks a lot! Florian ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-06 21:23 ` Florian Effenberger @ 2007-06-07 9:50 ` Tejun Heo 2007-06-07 14:08 ` Florian Effenberger ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Tejun Heo @ 2007-06-07 9:50 UTC (permalink / raw) To: Florian Effenberger; +Cc: linux-ide, jeff [-- Attachment #1: Type: text/plain, Size: 612 bytes --] Florian Effenberger wrote: > Hi, > >> Currently, there is no mechanism to do that but hard drives usually have >> dip switch to force 1.5Gbps. Please try that. If your harddrive >> doesn't have that, please lemme know. I'll prepare a simple patch. > > unfortunately, the disks have a jumper board, but the jumpers are > missing... could you write a patch for me? Would be much appreciated! Okay, there was a bug in link speed limit logic. That's probably why speed down to 1.5Gbps didn't kick in. The attached patch contains the fix and hack to force 1.5Gbps. Please give it a shot. Thanks. -- tejun [-- Attachment #2: ahci-force-1_5.patch --] [-- Type: text/x-patch, Size: 2156 bytes --] diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 7baeaff..f9550f1 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -219,6 +219,7 @@ static int ahci_init_one (struct pci_dev *pdev, const struct pci_device_id *ent) static unsigned int ahci_qc_issue(struct ata_queued_cmd *qc); static void ahci_irq_clear(struct ata_port *ap); static int ahci_port_start(struct ata_port *ap); +static int ahci_vt8251_port_start(struct ata_port *ap); static void ahci_port_stop(struct ata_port *ap); static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf); static void ahci_qc_prep(struct ata_queued_cmd *qc); @@ -284,7 +285,7 @@ static const struct ata_port_operations ahci_ops = { .port_resume = ahci_port_resume, #endif - .port_start = ahci_port_start, + .port_start = ahci_vt8251_port_start, .port_stop = ahci_port_stop, }; @@ -318,7 +319,7 @@ static const struct ata_port_operations ahci_vt8251_ops = { .port_resume = 
ahci_port_resume, #endif - .port_start = ahci_port_start, + .port_start = ahci_vt8251_port_start, .port_stop = ahci_port_stop, }; @@ -1558,6 +1559,19 @@ static int ahci_port_start(struct ata_port *ap) return 0; } +static int ahci_vt8251_port_start(struct ata_port *ap) +{ + struct ahci_host_priv *hpriv = ap->host->private_data; + + if (((hpriv->cap >> 20) & 0xf) != 1) { + printk("limiting SATA link speed to 1.5Gbps\n"); + ap->hw_sata_spd_limit = 1; + ap->eh_info.action |= ATA_EH_HARDRESET; + } + + return ahci_port_start(ap); +} + static void ahci_port_stop(struct ata_port *ap) { const char *emsg = NULL; diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 4733f00..57940ba 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -6313,7 +6313,8 @@ int ata_host_register(struct ata_host *host, struct scsi_host_template *sht) /* init sata_spd_limit to the current value */ if (sata_scr_read(ap, SCR_CONTROL, &scontrol) == 0) { int spd = (scontrol >> 4) & 0xf; - ap->hw_sata_spd_limit &= (1 << spd) - 1; + if (spd) + ap->hw_sata_spd_limit &= (1 << spd) - 1; } ap->sata_spd_limit = ap->hw_sata_spd_limit; ^ permalink raw reply related [flat|nested] 41+ messages in thread
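The libata-core.c hunk above can be reproduced in isolation. The sketch below copies only the mask arithmetic from the patch into two standalone helpers (the function names are hypothetical, and the starting limit of 0x3 is an assumed example value meaning "Gen1 and Gen2 allowed"). It shows why an SControl SPD field of 0 wiped out the speed limit mask before the fix.

```c
/* Mask arithmetic as it was before the patch: when the SControl SPD
 * field is 0 ("no limit"), (1 << 0) - 1 == 0 and the mask is cleared. */
static unsigned int spd_limit_before_fix(unsigned int hw_limit,
                                         unsigned int scontrol)
{
    unsigned int spd = (scontrol >> 4) & 0xf;

    hw_limit &= (1u << spd) - 1;
    return hw_limit;
}

/* Mask arithmetic with the "if (spd)" guard from the patch: an SPD
 * field of 0 leaves the existing limit untouched. */
static unsigned int spd_limit_after_fix(unsigned int hw_limit,
                                        unsigned int scontrol)
{
    unsigned int spd = (scontrol >> 4) & 0xf;

    if (spd)
        hw_limit &= (1u << spd) - 1;
    return hw_limit;
}
```

For the SControl value 0x300 seen earlier in the thread, `spd_limit_before_fix(0x3, 0x300)` returns 0, whereas `spd_limit_after_fix(0x3, 0x300)` returns 0x3. With SControl 0x310 both variants return 0x1, i.e. a limit of 1.5Gbps.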
* Re: libata interface fatal error 2007-06-07 9:50 ` Tejun Heo @ 2007-06-07 14:08 ` Florian Effenberger 2007-06-13 10:37 ` Florian Effenberger 2007-06-16 10:23 ` Florian Effenberger 2 siblings, 0 replies; 41+ messages in thread From: Florian Effenberger @ 2007-06-07 14:08 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-ide, jeff Hi Tejun, > Okay, there was a bug in link speed limit logic. That's probably why > speed down to 1.5Gbps didn't kick in. The attached patch contains the > fix and hack to force 1.5Gbps. Please give it a shot. thanks a lot for your help, much appreciated! Will test it and let you know if it works. Florian ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-07 9:50 ` Tejun Heo 2007-06-07 14:08 ` Florian Effenberger @ 2007-06-13 10:37 ` Florian Effenberger 2007-06-14 9:43 ` Tejun Heo 2007-06-16 10:23 ` Florian Effenberger 2 siblings, 1 reply; 41+ messages in thread From: Florian Effenberger @ 2007-06-13 10:37 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-ide, jeff Hi Tejun, > Okay, there was a bug in link speed limit logic. That's probably why > speed down to 1.5Gbps didn't kick in. The attached patch contains the > fix and hack to force 1.5Gbps. Please give it a shot. thanks a lot for your patch, it seems to work, at least better than without the patch. :-) When rsyncing about 12 GB, no trouble occurred. When doing heavy stress tests, I receive errors again, but okay, maybe that's due to a hardware bug. Will your patch go into the vanilla kernel? Florian ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-13 10:37 ` Florian Effenberger @ 2007-06-14 9:43 ` Tejun Heo 2007-06-14 11:12 ` Florian Effenberger 0 siblings, 1 reply; 41+ messages in thread From: Tejun Heo @ 2007-06-14 9:43 UTC (permalink / raw) To: Florian Effenberger; +Cc: linux-ide, jeff Florian Effenberger wrote: > Hi Tejun, > >> Okay, there was a bug in link speed limit logic. That's probably why >> speed down to 1.5Gbps didn't kick in. The attached patch contains the >> fix and hack to force 1.5Gbps. Please give it a shot. > > thanks a lot for your patch, it seems to work, at least better than > without the patch. :-) > > When rsyncing about 12 GB, no trouble occurred. When doing heavy stress > tests, I receive errors again, but okay, maybe that's due to a hardware > bug. > > Will your patch go into the vanilla kernel? I'm currently not sure what the root cause is: 1. if the controller is at fault, we need to force 1.5Gbps on the controller. 2. if the drive model is broken, we need to blacklist the drives. 3. if your specific configuration is broken (faulty hw, PSU, bad karma), the upstream speed limit fix patch should be enough. Can you post the result of 'hdparm -I /dev/sdX'? -- tejun ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-14 9:43 ` Tejun Heo @ 2007-06-14 11:12 ` Florian Effenberger 2007-06-14 12:25 ` Tejun Heo 0 siblings, 1 reply; 41+ messages in thread From: Florian Effenberger @ 2007-06-14 11:12 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-ide, jeff [-- Attachment #1: Type: text/plain, Size: 222 bytes --] Hi, > Can you post the result of 'hdparm -I /dev/sdX'? thanks a lot for your kind support, that is much appreciated! Attached is some machine output, hope that helps. Let me know if you need more information. Florian [-- Attachment #2: debug.txt --] [-- Type: text/plain, Size: 18541 bytes --] 00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub (rev 02) 00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port (rev 02) 00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02) 00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02) 00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02) 00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02) 00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02) 00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02) 00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02) 00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02) 00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02) 00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02) 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2) 00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02) 00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI Controller (rev 02) 00:1f.3 SMBus: Intel 
Corporation 82801H (ICH8 Family) SMBus Controller (rev 02) 01:00.0 VGA compatible controller: nVidia Corporation Unknown device 016a (rev a1) 03:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12) 04:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02) 04:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02) 05:01.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 0c) 05:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link) /dev/md0: Version : 00.90.03 Creation Time : Tue May 1 15:56:11 2007 Raid Level : raid5 Array Size : 937713408 (894.27 GiB 960.22 GB) Device Size : 312571136 (298.09 GiB 320.07 GB) Raid Devices : 4 Total Devices : 4 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Thu Jun 14 12:40:44 2007 State : clean Active Devices : 4 Working Devices : 4 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 64K UUID : 6e3c156d:c91eb028:40daae21:698c531b Events : 0.62 Number Major Minor RaidDevice State 0 8 16 0 active sync /dev/sdb 1 8 32 1 active sync /dev/sdc 2 8 48 2 active sync /dev/sdd 3 8 64 3 active sync /dev/sde /dev/sda: ATA device, with non-removable media Model Number: WDC WD1600YS-01SHB1 Serial Number: WD-WCAP01819659 Firmware Revision: 20.06C06 Standards: Supported: 7 6 5 4 Likely used: 7 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 321670847 device size with M = 1024*1024: 157065 MBytes device size with M = 1000*1000: 164695 MBytes (164 GB) Capabilities: LBA, IORDY(can be disabled) Queue depth: 32 Standby timer values: spec'd by Standard, with device specific minimum R/W multiple sector transfer: Max = 16 Current = 0 Recommended acoustic management value: 128, current 
value: 254 DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 Cycle time: min=120ns recommended=120ns PIO: pio0 pio1 pio2 pio3 pio4 Cycle time: no flow control=120ns IORDY flow control=120ns Commands/features: Enabled Supported: * SMART feature set Security Mode feature set * Power Management feature set * Write cache * Look-ahead * Host Protected Area feature set * WRITE_BUFFER command * READ_BUFFER command * NOP cmd * DOWNLOAD_MICROCODE Power-Up In Standby feature set * SET_FEATURES required to spinup after power up SET_MAX security extension Automatic Acoustic Management feature set * 48-bit Address feature set * Device Configuration Overlay feature set * Mandatory FLUSH_CACHE * FLUSH_CACHE_EXT * SMART error logging * SMART self-test * General Purpose Logging feature set * WRITE_{DMA|MULTIPLE}_FUA_EXT * 64-bit World wide name * SATA-I signaling speed (1.5Gb/s) * SATA-II signaling speed (3.0Gb/s) * Native Command Queueing (NCQ) * Host-initiated interface power management * Phy event counters DMA Setup Auto-Activate optimization * Software settings preservation * SMART Command Transport (SCT) feature set * SCT Long Sector Access (AC1) * SCT LBA Segment Access (AC2) * SCT Error Recovery Control (AC3) * SCT Features Control (AC4) * SCT Data Tables (AC5) unknown 206[12] Security: Master password revision code = 65534 supported not enabled not locked not frozen not expired: security count not supported: enhanced erase 52min for SECURITY ERASE UNIT. 
Checksum: correct /dev/sdb: ATA device, with non-removable media Model Number: WDC WD3200YS-01PGB0 Serial Number: WD-WCAPD3405080 Firmware Revision: 21.00M21 Standards: Supported: 7 6 5 4 Likely used: 7 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 625142448 device size with M = 1024*1024: 305245 MBytes device size with M = 1000*1000: 320072 MBytes (320 GB) Capabilities: LBA, IORDY(can be disabled) Queue depth: 1 Standby timer values: spec'd by Standard, with device specific minimum R/W multiple sector transfer: Max = 16 Current = 0 Recommended acoustic management value: 128, current value: 254 DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 Cycle time: min=120ns recommended=120ns PIO: pio0 pio1 pio2 pio3 pio4 Cycle time: no flow control=120ns IORDY flow control=120ns Commands/features: Enabled Supported: * SMART feature set Security Mode feature set * Power Management feature set * Write cache * Look-ahead * Host Protected Area feature set * WRITE_BUFFER command * READ_BUFFER command * NOP cmd * DOWNLOAD_MICROCODE Power-Up In Standby feature set * SET_FEATURES required to spinup after power up SET_MAX security extension Automatic Acoustic Management feature set * 48-bit Address feature set * Device Configuration Overlay feature set * Mandatory FLUSH_CACHE * FLUSH_CACHE_EXT * SMART error logging * SMART self-test * General Purpose Logging feature set * SATA-I signaling speed (1.5Gb/s) * SATA-II signaling speed (3.0Gb/s) * Native Command Queueing (NCQ) * Host-initiated interface power management * Phy event counters DMA Setup Auto-Activate optimization * Software settings preservation * SMART Command Transport (SCT) feature set * SCT Long Sector Access (AC1) * SCT LBA Segment Access (AC2) * SCT Error Recovery Control (AC3) * SCT Features Control (AC4) * SCT Data Tables (AC5) unknown 206[12] 
Security: Master password revision code = 65534 supported not enabled not locked not frozen not expired: security count not supported: enhanced erase Checksum: correct /dev/sdc: ATA device, with non-removable media Model Number: WDC WD3200YS-01PGB0 Serial Number: WD-WCAPD4087913 Firmware Revision: 21.00M21 Standards: Supported: 7 6 5 4 Likely used: 7 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 625142448 device size with M = 1024*1024: 305245 MBytes device size with M = 1000*1000: 320072 MBytes (320 GB) Capabilities: LBA, IORDY(can be disabled) Queue depth: 1 Standby timer values: spec'd by Standard, with device specific minimum R/W multiple sector transfer: Max = 16 Current = 0 Recommended acoustic management value: 128, current value: 254 DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 Cycle time: min=120ns recommended=120ns PIO: pio0 pio1 pio2 pio3 pio4 Cycle time: no flow control=120ns IORDY flow control=120ns Commands/features: Enabled Supported: * SMART feature set Security Mode feature set * Power Management feature set * Write cache * Look-ahead * Host Protected Area feature set * WRITE_BUFFER command * READ_BUFFER command * NOP cmd * DOWNLOAD_MICROCODE Power-Up In Standby feature set * SET_FEATURES required to spinup after power up SET_MAX security extension Automatic Acoustic Management feature set * 48-bit Address feature set * Device Configuration Overlay feature set * Mandatory FLUSH_CACHE * FLUSH_CACHE_EXT * SMART error logging * SMART self-test * General Purpose Logging feature set * SATA-I signaling speed (1.5Gb/s) * SATA-II signaling speed (3.0Gb/s) * Native Command Queueing (NCQ) * Host-initiated interface power management * Phy event counters DMA Setup Auto-Activate optimization * Software settings preservation * SMART Command Transport (SCT) feature set * SCT Long 
Sector Access (AC1) * SCT LBA Segment Access (AC2) * SCT Error Recovery Control (AC3) * SCT Features Control (AC4) * SCT Data Tables (AC5) unknown 206[12] Security: Master password revision code = 65534 supported not enabled not locked not frozen not expired: security count not supported: enhanced erase Checksum: correct /dev/sdd: ATA device, with non-removable media Model Number: WDC WD3200YS-01PGB0 Serial Number: WD-WCAPD4124047 Firmware Revision: 21.00M21 Standards: Supported: 7 6 5 4 Likely used: 7 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 625142448 device size with M = 1024*1024: 305245 MBytes device size with M = 1000*1000: 320072 MBytes (320 GB) Capabilities: LBA, IORDY(can be disabled) Queue depth: 1 Standby timer values: spec'd by Standard, with device specific minimum R/W multiple sector transfer: Max = 16 Current = 0 Recommended acoustic management value: 128, current value: 254 DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 Cycle time: min=120ns recommended=120ns PIO: pio0 pio1 pio2 pio3 pio4 Cycle time: no flow control=120ns IORDY flow control=120ns Commands/features: Enabled Supported: * SMART feature set Security Mode feature set * Power Management feature set * Write cache * Look-ahead * Host Protected Area feature set * WRITE_BUFFER command * READ_BUFFER command * NOP cmd * DOWNLOAD_MICROCODE Power-Up In Standby feature set * SET_FEATURES required to spinup after power up SET_MAX security extension Automatic Acoustic Management feature set * 48-bit Address feature set * Device Configuration Overlay feature set * Mandatory FLUSH_CACHE * FLUSH_CACHE_EXT * SMART error logging * SMART self-test * General Purpose Logging feature set * SATA-I signaling speed (1.5Gb/s) * SATA-II signaling speed (3.0Gb/s) * Native Command Queueing (NCQ) * Host-initiated interface power 
management * Phy event counters DMA Setup Auto-Activate optimization * Software settings preservation * SMART Command Transport (SCT) feature set * SCT Long Sector Access (AC1) * SCT LBA Segment Access (AC2) * SCT Error Recovery Control (AC3) * SCT Features Control (AC4) * SCT Data Tables (AC5) unknown 206[12] Security: Master password revision code = 65534 supported not enabled not locked not frozen not expired: security count not supported: enhanced erase Checksum: correct /dev/sde: ATA device, with non-removable media Model Number: WDC WD3200YS-01PGB0 Serial Number: WD-WCAPD3406202 Firmware Revision: 21.00M21 Standards: Supported: 7 6 5 4 Likely used: 7 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 625142448 device size with M = 1024*1024: 305245 MBytes device size with M = 1000*1000: 320072 MBytes (320 GB) Capabilities: LBA, IORDY(can be disabled) Queue depth: 1 Standby timer values: spec'd by Standard, with device specific minimum R/W multiple sector transfer: Max = 16 Current = 0 Recommended acoustic management value: 128, current value: 254 DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 Cycle time: min=120ns recommended=120ns PIO: pio0 pio1 pio2 pio3 pio4 Cycle time: no flow control=120ns IORDY flow control=120ns Commands/features: Enabled Supported: * SMART feature set Security Mode feature set * Power Management feature set * Write cache * Look-ahead * Host Protected Area feature set * WRITE_BUFFER command * READ_BUFFER command * NOP cmd * DOWNLOAD_MICROCODE Power-Up In Standby feature set * SET_FEATURES required to spinup after power up SET_MAX security extension Automatic Acoustic Management feature set * 48-bit Address feature set * Device Configuration Overlay feature set * Mandatory FLUSH_CACHE * FLUSH_CACHE_EXT * SMART error logging * SMART self-test * General Purpose 
Logging feature set * SATA-I signaling speed (1.5Gb/s) * SATA-II signaling speed (3.0Gb/s) * Native Command Queueing (NCQ) * Host-initiated interface power management * Phy event counters DMA Setup Auto-Activate optimization * Software settings preservation * SMART Command Transport (SCT) feature set * SCT Long Sector Access (AC1) * SCT LBA Segment Access (AC2) * SCT Error Recovery Control (AC3) * SCT Features Control (AC4) * SCT Data Tables (AC5) unknown 206[12] Security: Master password revision code = 65534 supported not enabled not locked not frozen not expired: security count not supported: enhanced erase Checksum: correct ^ permalink raw reply [flat|nested] 41+ messages in thread
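The size figures in the attached output can be cross-checked with simple arithmetic. This is a small sketch, assuming 512-byte sectors for the LBA48 counts reported by hdparm and assuming that a RAID5 array exposes the capacity of all member devices but one. The helper names are made up for illustration.

```c
#include <stdint.h>

#define SECTOR_BYTES 512ULL /* assumed logical sector size */

/* Device size in units of 1024*1024 bytes, truncated as in the hdparm output. */
static uint64_t sectors_to_mib(uint64_t sectors)
{
    return sectors * SECTOR_BYTES / (1024ULL * 1024ULL);
}

/* Device size in units of 1000*1000 bytes, truncated as in the hdparm output. */
static uint64_t sectors_to_mb(uint64_t sectors)
{
    return sectors * SECTOR_BYTES / (1000ULL * 1000ULL);
}

/* Usable RAID5 capacity: one member's worth of space holds parity. */
static uint64_t raid5_usable(uint64_t devices, uint64_t device_size)
{
    return devices < 2 ? 0 : (devices - 1) * device_size;
}
```

For the WD3200YS drives, `sectors_to_mib(625142448)` gives 305245 and `sectors_to_mb(625142448)` gives 320072, matching the two "device size" lines. `raid5_usable(4, 312571136)` gives 937713408, matching the Array Size that mdadm reports for /dev/md0.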
* Re: libata interface fatal error 2007-06-14 11:12 ` Florian Effenberger @ 2007-06-14 12:25 ` Tejun Heo 2007-06-14 15:12 ` Florian Effenberger 0 siblings, 1 reply; 41+ messages in thread From: Tejun Heo @ 2007-06-14 12:25 UTC (permalink / raw) To: Florian Effenberger; +Cc: linux-ide, jeff Florian Effenberger wrote: > Hi, > >> Can you post the result of 'hdparm -I /dev/sdX'? > > thanks a lot for your kind support, that is much appreciated! > > Attached is some machine output, hope that helps. Let me know if you need > more information. Okay, ich8. I don't think the chipset is at fault here and you have a lot of disks. My primary suspect is a power supply problem but things like this are hard to prove. With the merged speed down fix, libata will do the right thing after a few errors, so ignoring the problem wouldn't be too bad an idea. If you're curious, you can try to connect drives to different SATA ports and power lanes and see whether errors follow the disk, port or power lane. -- tejun ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-14 12:25 ` Tejun Heo @ 2007-06-14 15:12 ` Florian Effenberger 2007-06-18 3:10 ` Tejun Heo 0 siblings, 1 reply; 41+ messages in thread From: Florian Effenberger @ 2007-06-14 15:12 UTC (permalink / raw) To: Tejun Heo; +Cc: linux-ide, jeff Hi Tejun, > Okay, ich8. I don't think the chipset is at fault here and you have a > lot of disks. My primary suspect is a power supply problem but things > like this are hard to prove. With the merged speed down fix, libata > will do the right thing after a few errors, so ignoring the problem > wouldn't be too bad an idea. If you're curious, you can try to connect > drives to different SATA ports and power lanes and see whether errors > follow the disk, port or power lane. exactly, should be four disks in the machine. What power supply would you recommend for this type of disk? I think we got a 450W Enermax, IIRC. What do you mean by "merged speed down fix"? Is your fix for the speed down logic implemented in the current kernel, so I don't have to patch anymore (except when I want to force 1.5Gbps right from the beginning)? All SATA ports are used, so reconnecting is not an option. But I can try to switch the power supply (lanes). Thanks for all your kind help, that is much appreciated! Florian ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-14 15:12 ` Florian Effenberger @ 2007-06-18 3:10 ` Tejun Heo 2007-06-18 6:08 ` Tomi Orava 2007-06-18 10:38 ` Florian Effenberger 0 siblings, 2 replies; 41+ messages in thread From: Tejun Heo @ 2007-06-18 3:10 UTC (permalink / raw) To: Florian Effenberger; +Cc: linux-ide, jeff Hello, Florian Effenberger wrote: > What power supply would you recommend for this type of disk? I think we > got a 450W Enermax, IIRC. Most power supplies should be able to do 4 disks without any problem unless they're broken. > What do you mean by "merged speed down fix"? Is your fix for the speed > down logic implemented in the current kernel, so I don't have to patch > anymore (except when I want to force 1.5Gbps right from the beginning)? Yeap, kernel will automatically downgrade to 1.5Gbps after several failures. -- tejun ^ permalink raw reply [flat|nested] 41+ messages in thread
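The automatic downgrade described above can be pictured as dropping the fastest speed still permitted by the limit mask. The following is a simplified sketch of that idea, not the actual libata implementation: the function name is hypothetical, and it assumes the same bitmask convention as the patch earlier in the thread (bit 0 for 1.5Gbps, bit 1 for 3.0Gbps).

```c
#include <stdint.h>

/* Remove the highest allowed speed from *limit.
 * Returns 0 on success, or -1 if no slower speed remains, in which
 * case *limit is left unchanged. */
static int speed_limit_step_down(uint32_t *limit)
{
    uint32_t mask = *limit;
    unsigned int highbit = 0;
    unsigned int i;

    if (mask == 0)
        return -1;

    /* Find the highest set bit, i.e. the fastest permitted speed. */
    for (i = 0; i < 32; i++) {
        if (mask & (UINT32_C(1) << i))
            highbit = i;
    }

    mask &= ~(UINT32_C(1) << highbit);
    if (mask == 0)
        return -1; /* already at the slowest permitted speed */

    *limit = mask;
    return 0;
}
```

Starting from a mask of 0x3, one call leaves 0x1 (1.5Gbps only) and returns 0. A second call returns -1 because nothing slower is available, which is the point at which an error handler would have to try other measures.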
* Re: libata interface fatal error 2007-06-18 3:10 ` Tejun Heo @ 2007-06-18 6:08 ` Tomi Orava 2007-06-18 6:28 ` Tejun Heo 2007-06-18 10:38 ` Florian Effenberger 1 sibling, 1 reply; 41+ messages in thread From: Tomi Orava @ 2007-06-18 6:08 UTC (permalink / raw) To: Tejun Heo; +Cc: Florian Effenberger, linux-ide, jeff Hi Tejun, I've been trying for a long time to find a solution for libata error messages quite similar to those shown in this thread. Perhaps you might have some ideas what the actual cause might be: With the latest 2.6.22-rc4-git4 kernel I still get the following error messages with high I/O load: sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) sd 2:0:0:0: [sdc] Write Protect is off sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 ata3.00: (port_status 0x20080000) ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation) ata3: soft resetting port ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata3.00: configured for UDMA/133 ata3: EH complete ... and later in the chain ... 
sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) sd 2:0:0:0: [sdc] Write Protect is off sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 ata3.00: (port_status 0x20080000) ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation) ata3: soft resetting port ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata3.00: configured for UDMA/100 ata3: EH complete --- This goes on until UDMA/33 has been reached. The problematic hardware combination is: 00:00.0 Host bridge: VIA Technologies, Inc. KT880 Host Bridge (rev 80) 00:00.1 Host bridge: VIA Technologies, Inc. KT880 Host Bridge 00:00.2 Host bridge: VIA Technologies, Inc. KT880 Host Bridge 00:00.3 Host bridge: VIA Technologies, Inc. KT880 Host Bridge 00:00.4 Host bridge: VIA Technologies, Inc. KT880 Host Bridge 00:00.7 Host bridge: VIA Technologies, Inc. KT880 Host Bridge 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge 00:09.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13) 00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) 00:0e.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02) 00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80) 00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) 00:10.1 USB Controller: VIA Technologies, Inc. 
VT82xxxxx UHCI USB 1.1 Controller (rev 81) 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86) 00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South] 00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60) 00:11.6 Communication controller: VIA Technologies, Inc. AC'97 Modem Controller (rev 80) 01:00.0 VGA compatible controller: nVidia Corporation NV36.2 [GeForce FX 5700] (rev a1) and the problems relate only to Seagate 7200.10 SATA-disks, never to the older 7200.7 SATA-disks, all connected to the Promise Sata 300TX4-controller. This problem has been around for as long as I've had the Promise Sata300TX4 controller. An additional new problem is that after kernel version 2.6.21-rc3-git10 the libata error handling/interface speed downgrade has been fixed ---> these new Seagate disks get downgraded from UDMA/133 to UDMA/33 overnight (can the speed downgrade somehow be disabled as a quick and dirty fix in this case?). For some reason the above mentioned libata error messages don't really do any noticeable harm but it would be very nice to be able to prevent the interface speed downgrade for now. >> What do you mean by "merged speed down fix"? Is your fix for the speed >> down logic implemented in the current kernel, so I don't have to patch >> anymore (except when I want to force 1.5Gbps right from the beginning)? > > Yeap, kernel will automatically downgrade to 1.5Gbps after several > failures. Yes, this feature seems to work quite nicely as the included logs show. Regards, Tomi Orava PS. 
These problems are not special to this single machine as a friend at work has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10 SATA-disks on an intel-based P4 machine with similar problems under I/O-load. --------------------------------------------------------- scsi0 : sata_promise scsi1 : sata_promise scsi2 : sata_promise scsi3 : sata_promise ata1: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x00000000 irq 0 ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x00000000 irq 0 ata3: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x00000000 irq 0 ata4: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x00000000 irq 0 Switched to high resolution mode on CPU 0 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 ata1.00: ATA-6: ST3200822AS, 3.01, max UDMA/133 ata1.00: 390721968 sectors, multi 0: LBA48 ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 ata1.00: configured for UDMA/133 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133 ata2.00: 390721968 sectors, multi 0: LBA48 ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 ata2.00: configured for UDMA/133 ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata3.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133 ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata3.00: configured for UDMA/133 ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 ata4.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133 ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) ata4.00: ata_hpa_resize 1: sectors = 976773168, 
hpa_sectors = 976773168 ata4.00: configured for UDMA/133 scsi 0:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5 sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sda: sda1 sda2 sd 0:0:0:0: [sda] Attached SCSI disk sd 0:0:0:0: Attached scsi generic sg0 type 0 scsi 1:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5 sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdb: sdb1 sdb2 sd 1:0:0:0: [sdb] Attached SCSI disk sd 1:0:0:0: Attached scsi generic sg1 type 0 scsi 2:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5 sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) sd 2:0:0:0: [sdc] Write Protect is off sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) sd 2:0:0:0: [sdc] Write Protect is off sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdc: sdc1 sdc2 sd 2:0:0:0: [sdc] Attached SCSI disk sd 2:0:0:0: Attached scsi generic sg2 type 0 scsi 3:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 
0 ANSI: 5 sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB) sd 3:0:0:0: [sdd] Write Protect is off sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB) sd 3:0:0:0: [sdd] Write Protect is off sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdd: sdd1 sdd2 sd 3:0:0:0: [sdd] Attached SCSI disk sd 3:0:0:0: Attached scsi generic sg3 type 0 -- ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error 2007-06-18 6:08 ` Tomi Orava @ 2007-06-18 6:28 ` Tejun Heo 0 siblings, 0 replies; 41+ messages in thread From: Tejun Heo @ 2007-06-18 6:28 UTC (permalink / raw) To: Tomi Orava; +Cc: Florian Effenberger, linux-ide, jeff, Mikael Pettersson Hello, Yeah, it seems promise has some problem with 3G link. Cc'ing Mikael Pettersson and quoting whole body for him. Mikael, does this look familiar? Tomi Orava wrote: > Hi Tejun, > > I've been trying to find a solution for a long time for quite a similar > libata errror messages as shown in this thread. Perhaps you might get have > some ideas what the actual originator might be: > > With the latest 2.6.22-rc4-git4 kernel I still get the following error > messages > with high I/O load: > > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) > sd 2:0:0:0: [sdc] Write Protect is off > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 > ata3.00: (port_status 0x20080000) > ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in > res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation) > ata3: soft resetting port > ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata3.00: configured for UDMA/133 > ata3: EH complete > > ... and later in the chain ... 
> > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) > sd 2:0:0:0: [sdc] Write Protect is off > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 > ata3.00: (port_status 0x20080000) > ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in > res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation) > ata3: soft resetting port > ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata3.00: configured for UDMA/100 > ata3: EH complete > > --- This goes on until UDMA/33 has been reched > > The problematic hardware combination is: > > 00:00.0 Host bridge: VIA Technologies, Inc. KT880 Host Bridge (rev 80) > 00:00.1 Host bridge: VIA Technologies, Inc. KT880 Host Bridge > 00:00.2 Host bridge: VIA Technologies, Inc. KT880 Host Bridge > 00:00.3 Host bridge: VIA Technologies, Inc. KT880 Host Bridge > 00:00.4 Host bridge: VIA Technologies, Inc. KT880 Host Bridge > 00:00.7 Host bridge: VIA Technologies, Inc. KT880 Host Bridge > 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge > 00:09.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit > Ethernet Controller (rev 13) > 00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. > RTL-8139/8139C/8139C+ (rev 10) > 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. > RTL-8139/8139C/8139C+ (rev 10) > 00:0e.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA > 300 TX4) (rev 02) > 00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID > Controller (rev 80) > 00:0f.1 IDE interface: VIA Technologies, Inc. > VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) > 00:10.0 USB Controller: VIA Technologies, Inc. 
VT82xxxxx UHCI USB 1.1 > Controller (rev 81) > 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > Controller (rev 81) > 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > Controller (rev 81) > 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 > Controller (rev 81) > 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86) > 00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge > [KT600/K8T800/K8T890 South] > 00:11.5 Multimedia audio controller: VIA Technologies, Inc. > VT8233/A/8235/8237 AC97 Audio Controller (rev 60) > 00:11.6 Communication controller: VIA Technologies, Inc. AC'97 Modem > Controller (rev 80) > 01:00.0 VGA compatible controller: nVidia Corporation NV36.2 [GeForce FX > 5700] (rev a1) > > and the problems relate only to Seagate 7200.10 SATA-disks, never with the > older 7200.7 SATA-disks alll connected to Promise Sata 300TX4-controller. > > Because this problem has been around for as long as I've had the Promise > Sata300TX4 controller an additional new problem is that after kernel > version 2.6.21-rc3-git10 the libata error handling/interface speed > downgrade has been fixed ---> these new seagate disks get downgraded from > UDMA/133 to UDMA/33 overnight (can the speed downgrade be disabled as a > quick and dirty fix in this case somehow ?). For some reason the above > mentioned libata error messages don't really do any noticeable harm but it > would be very nice to be able to prevent the interface speed downgrade for > now. > >>> What do you mean by "merged speed down fix"? Is your fix for the speed >>> down logic implemented in the current kernel, so I don't have to patch >>> anymore (except when I want to force 1.5Gbps right from the beginning)? >> Yeap, kernel will automatically downgrade to 1.5Gbps after several >> failures. > > Yes, this feature seems to work quite nicely as the included logs show. > > Regards, > Tomi Orava > > PS. 
These problems are not special to this single machine as a friend at work > has the same Promise Sata300TX4 card with exactly the same Seagate > 7200.10 > SATA-disks on an intel-based P4 machine with similar problems under > I/O-load. > > --------------------------------------------------------- > scsi0 : sata_promise > scsi1 : sata_promise > scsi2 : sata_promise > scsi3 : sata_promise > ata1: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x00000000 irq 0 > ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x00000000 irq 0 > ata3: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x00000000 irq 0 > ata4: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x00000000 irq 0 > Switched to high resolution mode on CPU 0 > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 > ata1.00: ATA-6: ST3200822AS, 3.01, max UDMA/133 > ata1.00: 390721968 sectors, multi 0: LBA48 > ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 > ata1.00: configured for UDMA/133 > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 > ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133 > ata2.00: 390721968 sectors, multi 0: LBA48 > ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 > ata2.00: configured for UDMA/133 > ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata3.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133 > ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata3.00: configured for UDMA/133 > ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata4.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133 > ata4.00: 976773168 sectors, multi 0: LBA48 
NCQ (depth 0/32) > ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168 > ata4.00: configured for UDMA/133 > scsi 0:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5 > sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) > sd 0:0:0:0: [sda] Write Protect is off > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) > sd 0:0:0:0: [sda] Write Protect is off > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sda: sda1 sda2 > sd 0:0:0:0: [sda] Attached SCSI disk > sd 0:0:0:0: Attached scsi generic sg0 type 0 > scsi 1:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5 > sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB) > sd 1:0:0:0: [sdb] Write Protect is off > sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 > sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB) > sd 1:0:0:0: [sdb] Write Protect is off > sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 > sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sdb: sdb1 sdb2 > sd 1:0:0:0: [sdb] Attached SCSI disk > sd 1:0:0:0: Attached scsi generic sg1 type 0 > scsi 2:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5 > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) > sd 2:0:0:0: [sdc] Write Protect is off > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB) > sd 2:0:0:0: [sdc] Write Protect is off > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > 
sdc: sdc1 sdc2 > sd 2:0:0:0: [sdc] Attached SCSI disk > sd 2:0:0:0: Attached scsi generic sg2 type 0 > scsi 3:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5 > sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB) > sd 3:0:0:0: [sdd] Write Protect is off > sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 > sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB) > sd 3:0:0:0: [sdd] Write Protect is off > sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 > sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't > support DPO or FUA > sdd: sdd1 sdd2 > sd 3:0:0:0: [sdd] Attached SCSI disk > sd 3:0:0:0: Attached scsi generic sg3 type 0 > -- tejun ^ permalink raw reply [flat|nested] 41+ messages in thread
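A note on reading the "SATA link up" lines quoted above: the two hex values are the port's SStatus and SControl registers. The following is a minimal sketch, not part of the original mails, assuming the field layout from the Serial ATA specification (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The helper names decode_scr and describe_link are made up for illustration.

```python
# Assumed layout (Serial ATA specification): DET bits 3:0, SPD bits 7:4, IPM bits 11:8.
SPD_NAMES = {1: "1.5 Gbps (Gen1)", 2: "3.0 Gbps (Gen2)"}


def decode_scr(value):
    """Split an SStatus or SControl value into its DET, SPD and IPM nibbles."""
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def describe_link(sstatus_hex, scontrol_hex):
    """Summarise negotiated speed (SStatus) and configured speed limit (SControl)."""
    status = decode_scr(int(sstatus_hex, 16))    # the kernel prints these in hex
    control = decode_scr(int(scontrol_hex, 16))
    negotiated = SPD_NAMES.get(status["spd"], "unknown")
    if control["spd"] == 0:
        limit = "no speed limit"
    else:
        limit = "limited to at most " + SPD_NAMES.get(control["spd"], "unknown")
    return f"negotiated {negotiated}, {limit}"


print(describe_link("123", "300"))  # negotiated 3.0 Gbps (Gen2), no speed limit
print(describe_link("113", "310"))  # negotiated 1.5 Gbps (Gen1), limited to at most 1.5 Gbps (Gen1)
```

Read this way, "SStatus 113 SControl 310" in the later log excerpt suggests the speed limit was written into SControl and the link then came back up at 1.5 Gbps.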
* Re: libata interface fatal error
2007-06-18 3:10 ` Tejun Heo
2007-06-18 6:08 ` Tomi Orava
@ 2007-06-18 10:38 ` Florian Effenberger
2007-06-18 10:44 ` Tejun Heo
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 10:38 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> Yeap, kernel will automatically downgrade to 1.5Gbps after several failures.
is there also a boot-time option to force 1.5Gbps right from booting up?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:38 ` Florian Effenberger
@ 2007-06-18 10:44 ` Tejun Heo
0 siblings, 0 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 10:44 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi,
>
>> Yeap, kernel will automatically downgrade to 1.5Gbps after several
>> failures.
>
> is there also a boot-time option to force 1.5Gbps right from booting up?
Nope.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-07 9:50 ` Tejun Heo
2007-06-07 14:08 ` Florian Effenberger
2007-06-13 10:37 ` Florian Effenberger
@ 2007-06-16 10:23 ` Florian Effenberger
2007-06-18 3:13 ` Tejun Heo
2 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-16 10:23 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi there,
we tested out two 600W Fortron PSUs, also tried a BIOS update. Didn't
work out.
We also tried the jumper on the disks labelled SSP (Spread Spectrum
Clocking), which didn't work out either.
What seemed to help at least a little bit is to use the 12V connector on
the board that is normally dedicated to graphics cards.
The best test to reproduce the problem, according to a colleague also
working on the machine, is a
cat /dev/zero > zero.bin
Do you still think it is a PSU or hardware problem? Do you need more
details/logs?
Thanks!
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-16 10:23 ` Florian Effenberger
@ 2007-06-18 3:13 ` Tejun Heo
2007-06-18 10:44 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 3:13 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Hello,
Florian Effenberger wrote:
> we tested out two 600W Fortron PSUs, also tried a BIOS update. Didn't
> work out.
I see.
> We also tried the jumper on the disks labelled SSP (Spread Spectrum
> Clocking), which didn't work out either.
>
> What seemed to help at least a little bit is to use the 12V connector on
> the board that is normally dedicated to graphics cards.
Hmmmmm....
> The best test to reproduce the problem, according to a colleague also
> working on the machine, is a cat /dev/zero > zero.bin
>
> Do you still think it is a PSU or hardware problem? Do you need more
> details/logs?
The controller being ich8, I'm pretty sure it isn't a driver problem.
Do the errors occur on all four drives? Also, if things work after
speed is downgraded to 1.5Gbps, it doesn't really matter. There's no
noticeable performance difference for a single disk anyway.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 3:13 ` Tejun Heo
@ 2007-06-18 10:44 ` Florian Effenberger
2007-06-18 10:56 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 10:44 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> The controller being ich8, I'm pretty sure it isn't a driver problem.
I think so, too. The Intel chipsets have proven to be very good in the past.
> Do the errors occur on all four drives? Also, if things work after
> speed is downgraded to 1.5Gbps, it doesn't really matter. There's no
> noticeable performance difference for single disk anyway.
Yes, they do occur on all drives, as far as I know. With 1.5Gbps, the
error doesn't occur nearly as often and not under normal circumstances,
only when doing a really hard stress test.
Would it make sense to downgrade to 1.5 Gbps via a boot option?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:44 ` Florian Effenberger
@ 2007-06-18 10:56 ` Tejun Heo
2007-06-18 11:28 ` Florian Effenberger
2007-06-24 11:32 ` Florian Effenberger
0 siblings, 2 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 10:56 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi,
>
>> The controller being ich8, I'm pretty sure it isn't a driver problem.
>
> I think so, too. The Intel chipsets have proven to be very good in the past.
>
>> Do the errors occur on all four drives? Also, if things work after
>> speed is downgraded to 1.5Gbps, it doesn't really matter. There's no
>> noticeable performance difference for single disk anyway.
>
> Yes, they do occur on all drives, as far as I know. With 1.5Gbps, the
> error doesn't occur nearly as often and not under normal circumstances,
> only when doing a really hard stress test.
Hmmm... Can you use a separate PSU to power two of the four drives and
see what happens? Just power up a PSU as directed in the following
webpage and connect two of the hard drives to the PSU.
http://modtown.co.uk/mt/article2.php?id=psumod
> Would it make sense to downgrade to 1.5 Gbps via a boot option?
I don't know. Till now all the problem cases have been isolated to a
specific controller / drive combination (sata_promise and newer Seagate
drives) or a hardware configuration problem (most of them being PSU
issues), so I don't think we need such an option yet. If you have
problematic hardware which pukes on 3.0Gbps, libata should do the right
thing after complaining a bit, which IMHO isn't too bad.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
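To check from a saved kernel log whether libata really did "the right thing" on a given port, the speed-down history can be pulled out of the log text. The sketch below is only an illustration, assuming the message wording seen in the excerpts earlier in this thread ("SATA link up ... Gbps" and "configured for UDMA/..."). The function name track_speed_changes is hypothetical.

```python
import re

# Patterns follow the log wording quoted in this thread; other kernels may differ.
LINK_RE = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")
MODE_RE = re.compile(r"(ata\d+)\.\d+: configured for (UDMA/\d+)")


def track_speed_changes(log_lines):
    """Return, per port, the ordered distinct link speeds and transfer modes seen."""
    history = {}
    for line in log_lines:
        for kind, regex in (("link", LINK_RE), ("mode", MODE_RE)):
            match = regex.search(line)
            if not match:
                continue
            port, value = match.groups()
            seen = history.setdefault(port, {"link": [], "mode": []})[kind]
            if not seen or seen[-1] != value:  # record only actual changes
                seen.append(value)
    return history


SAMPLE = [
    "ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)",
    "ata3.00: configured for UDMA/133",
    "ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
    "ata3.00: configured for UDMA/100",
]
print(track_speed_changes(SAMPLE))
```

A port whose "link" list goes from 3.0 to 1.5 and then stays quiet would match the behaviour described above; a "mode" list that keeps shrinking points at errors continuing after the link speed-down.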
* Re: libata interface fatal error
2007-06-18 10:56 ` Tejun Heo
@ 2007-06-18 11:28 ` Florian Effenberger
2007-06-18 11:30 ` Tejun Heo
2007-06-24 11:32 ` Florian Effenberger
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 11:28 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Hmmm... Can you use a separate PSU to power two of the four drives and
> see what happens? Just power up a PSU as directed in the following
> webpage and connect two of the harddrives to the PSU.
>
> http://modtown.co.uk/mt/article2.php?id=psumod
thanks for that link, we will try that and keep you updated what happens!
> I don't know. Till now all the problem cases have been isolated to a
> specific controller / drive combination (sata_promise and newer seagate
> drives) or hardware configuration problem (most of them being PSU
> issues), so I don't think we need such option yet. If you have a
> problematic hardware which pukes on 3.0Gbps, libata should do the right
> thing after complaining a bit which IMHO isn't too bad.
So, loss of data or data corruption can't occur, even when we have to
wait until the speed is limited?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 11:28 ` Florian Effenberger
@ 2007-06-18 11:30 ` Tejun Heo
2007-06-18 11:32 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 11:30 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
>> I don't know. Till now all the problem cases have been isolated to a
>> specific controller / drive combination (sata_promise and newer seagate
>> drives) or hardware configuration problem (most of them being PSU
>> issues), so I don't think we need such option yet. If you have a
>> problematic hardware which pukes on 3.0Gbps, libata should do the right
>> thing after complaining a bit which IMHO isn't too bad.
>
> So, loss of data or data corruption can't occur, even when we have to
> wait until the speed is limited?
Nope, there's nothing to worry about.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 11:30 ` Tejun Heo
@ 2007-06-18 11:32 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 11:32 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> Nope, there's nothing to worry about.
okay, thanks a lot so far, it is good to know that developers are there
to help. ;-)
I will let you know how it turned out with the second PSU.
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:56 ` Tejun Heo
2007-06-18 11:28 ` Florian Effenberger
@ 2007-06-24 11:32 ` Florian Effenberger
2007-06-25 2:49 ` Tejun Heo
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-24 11:32 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi there,
sorry, it seems it was all a false alarm and our mainboard was
defective. In the end, it turned on only sometimes. To test it, we
wanted to install Windows, which didn't work either.
Now the dealer has replaced the motherboard, and we are just fine with
3.0 Gbps and Kernel 2.6.21.5.
Sorry for the big confusion, and thanks for your great help! I didn't
know the board was defective in the first place, there have been no
indications like that...
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-24 11:32 ` Florian Effenberger
@ 2007-06-25 2:49 ` Tejun Heo
2007-06-25 8:47 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-25 2:49 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> sorry, it seems it was all a false alarm and our mainboard was
> defective. In the end, it turned on only sometimes. To test it, we
> wanted to install Windows, which didn't work either.
>
> Now the dealer has replaced the motherboard, and we are just fine with
> 3.0 Gbps and Kernel 2.6.21.5.
>
> Sorry for the big confusion, and thanks for your great help! I didn't
> know the board was defective in the first place, there have been no
> indications like that...
Yeah, things like these are tricky. SATA is usually the first one to
suffer from hardware defects, including power fluctuation due to input
power, PSU or on-board voltage regulator problems, because the link is
relatively long and runs at very high speed. I also heard that SATA
cables should have been made more resistant to interference but I'm no
expert in that area.
It's interesting to see how it got solved. Thanks for another data
point to blame hardware when I don't have a clue. :-)
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-25 2:49 ` Tejun Heo
@ 2007-06-25 8:47 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-25 8:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Yeah, things like these are tricky. SATA is usually the first one to
> suffer from hardware defect including power fluctuation due to input
> power, PSU or on-board voltage regulator problems because the link is
> relatively long and runs at very high speed. I also heard that SATA
> cables should have been made more resistant to interference but I'm no
> expert in that area.
me neither. I first thought of a driver issue, because the machine ran
just fine and then started to show mysterious effects some weeks later...
> It's interesting to see how it got solved. Thanks for another data
> point to blame hardware when I don't have a clue. :-)
Hehe, you're welcome. ;-)
Thanks for all your efforts, I really appreciate them!
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
@ 2007-06-18 7:05 Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
` (2 more replies)
0 siblings, 3 replies; 41+ messages in thread
From: Mikael Pettersson @ 2007-06-18 7:05 UTC (permalink / raw)
To: Tomi.Orava, htejun; +Cc: florian, jeff, linux-ide, mikpe
On Mon, 18 Jun 2007 15:28:44 +0900, Tejun Heo wrote:
> Yeah, it seems promise has some problem with 3G link. Cc'ing Mikael
> Pettersson and quoting whole body for him. Mikael, does this look familiar?
>
> Tomi Orava wrote:
> > Hi Tejun,
> >
> > I've been trying to find a solution for a long time for quite similar
> > libata error messages as shown in this thread. Perhaps you might have
> > some ideas what the actual originator might be:
> >
> > With the latest 2.6.22-rc4-git4 kernel I still get the following error
> > messages
> > with high I/O load:
> >
> > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> > sd 2:0:0:0: [sdc] Write Protect is off
> > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> > support DPO or FUA
> > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> > ata3.00: (port_status 0x20080000)
> > ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in
> > res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation)
> > ata3: soft resetting port
> > ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: configured for UDMA/133
> > ata3: EH complete
> >
> > ... and later in the chain ...
> >
> > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> > sd 2:0:0:0: [sdc] Write Protect is off
> > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> > support DPO or FUA
> > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> > ata3.00: (port_status 0x20080000)
> > ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in
> > res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation)
> > ata3: soft resetting port
> > ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: configured for UDMA/100
> > ata3: EH complete
> >
> > --- This goes on until UDMA/33 has been reached
...
> > and the problems relate only to Seagate 7200.10 SATA-disks, never with the
> > older 7200.7 SATA-disks, all connected to the Promise Sata 300TX4-controller.
...
> > PS. These problems are not special to this single machine as a friend at work
> > has the same Promise Sata300TX4 card with exactly the same Seagate
> > 7200.10
> > SATA-disks on an intel-based P4 machine with similar problems under
> > I/O-load.
Yes, this is familiar. Several people have reported problems with
Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
Unfortunately the error reports don't really give a clue as to what
the root cause is.
I used to be able to forcibly trigger similar errors with their
7200.9 disks, but I can't seem to do that any more.
/Mikael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 7:05 Mikael Pettersson
@ 2007-06-18 7:13 ` Tejun Heo
2007-06-18 10:47 ` Florian Effenberger
2007-06-18 17:14 ` Ansgar Knappheide
2007-06-18 18:54 ` Tomi Orava
2 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 7:13 UTC (permalink / raw)
To: Mikael Pettersson; +Cc: Tomi.Orava, florian, jeff, linux-ide
Mikael Pettersson wrote:
> Yes, this is familiar. Several people have reported problems with
> Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
> Unfortunately the error reports don't really give a clue as to what
> the root cause is.
>
> I used to be able to forcibly trigger similar errors with their
> 7200.9 disks, but I can't seem to do that any more.
Maybe we need to limit link speed to 1.5Gbps for these drives on
sata_promise?
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
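For anyone trying to read the "cmd .../res ..." lines in the quoted error reports, here is a small decoding sketch. It is not from the original mails and rests on two assumptions: that the slash-separated fields follow libata's error-handler layout (command or status, then feature or error, sector count and LBA low/mid/high, then the five HOB bytes, then the device register), and that status bit 0x10 carries its older ATA name DSC. All function names are made up for illustration.

```python
# ATA status register bits; 0x10 is shown under its older name DSC (assumption).
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"),
               (0x10, "DSC"), (0x08, "DRQ"), (0x01, "ERR")]


def parse_taskfile(text):
    """Parse e.g. 'c8/00:08:af:91:49/00:00:00:00:00/e5' into named registers."""
    first, low, high, device = text.split("/")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    return {
        "first": int(first, 16),   # command byte in "cmd", status byte in "res"
        "feature": feature,        # feature in "cmd", error register in "res"
        "nsect": nsect, "lbal": lbal, "lbam": lbam, "lbah": lbah,
        "hob": [int(x, 16) for x in high.split(":")],
        "device": int(device, 16),
    }


def lba28(tf):
    """28-bit LBA: low nibble of the device register plus the three LBA bytes."""
    return ((tf["device"] & 0x0F) << 24) | (tf["lbah"] << 16) | (tf["lbam"] << 8) | tf["lbal"]


def status_flags(status):
    """Names of the status bits that are set."""
    return [name for mask, name in STATUS_BITS if status & mask]


cmd = parse_taskfile("c8/00:08:af:91:49/00:00:00:00:00/e5")
res = parse_taskfile("50/00:00:b6:91:49/00:00:11:00:00/e5")
print(hex(cmd["first"]), cmd["nsect"] * 512, hex(lba28(cmd)))  # 0xc8 4096 0x54991af
print(status_flags(res["first"]))                              # ['DRDY', 'DSC']
```

Command 0xc8 is READ DMA, and 8 sectors of 512 bytes agree with the "data 4096 in" part of the report. The result status 0x50 has neither ERR nor DF set, so the drive itself does not flag an error in that report.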
* Re: libata interface fatal error
2007-06-18 7:13 ` Tejun Heo
@ 2007-06-18 10:47 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 10:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: Mikael Pettersson, Tomi.Orava, jeff, linux-ide
Hi,
> Maybe we need to limit link speed to 1.5Gbps for these drives on
> sata_promise?
in our case, it's a
Vendor: ATA Model: WDC WD1600YS-01S Rev: 20.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
on a
00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Unknown device 016a (rev a1)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
04:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
04:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
05:01.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 0c)
05:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
Maybe blacklisting makes sense here, too?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 7:05 Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
@ 2007-06-18 17:14 ` Ansgar Knappheide
2007-06-18 18:54 ` Tomi Orava
2 siblings, 0 replies; 41+ messages in thread
From: Ansgar Knappheide @ 2007-06-18 17:14 UTC (permalink / raw)
To: linux-ide

Mikael Pettersson wrote:
> On Mon, 18 Jun 2007 15:28:44 +0900, Tejun Heo wrote:
>
>> Yeah, it seems promise has some problem with 3G link. Cc'ing Mikael
>> Pettersson and quoting whole body for him. Mikael, does this look familiar?
>>
>> Tomi Orava wrote:
>>
>>> Hi Tejun,
>>>
>>> I've been trying to find a solution for a long time for quite a similar
>>> libata error messages as shown in this thread. Perhaps you might have
>>> some ideas what the actual originator might be:
>>>
>>> With the latest 2.6.22-rc4-git4 kernel I still get the following error
>>> messages with high I/O load:
>>>
>>> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
>>> sd 2:0:0:0: [sdc] Write Protect is off
>>> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
>>> support DPO or FUA
>>> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
>>> ata3.00: (port_status 0x20080000)
>>> ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in
>>>          res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation)
>>> ata3: soft resetting port
>>> ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: configured for UDMA/133
>>> ata3: EH complete
>>>
>>> ... and later in the chain ...
>>>
>>> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
>>> sd 2:0:0:0: [sdc] Write Protect is off
>>> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
>>> support DPO or FUA
>>> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
>>> ata3.00: (port_status 0x20080000)
>>> ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in
>>>          res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation)
>>> ata3: soft resetting port
>>> ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: configured for UDMA/100
>>> ata3: EH complete
>>>
>>> --- This goes on until UDMA/33 has been reached
>>>
> ...
>
>>> and the problems relate only to Seagate 7200.10 SATA-disks, never with the
>>> older 7200.7 SATA-disks all connected to Promise Sata 300TX4-controller.
>>>
> ...
>
>>> PS. These problems are not special to this single machine as a friend at work
>>> has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10
>>> SATA-disks on an intel-based P4 machine with similar problems under
>>> I/O-load.
>>>
>
> Yes, this is familiar. Several people have reported problems with
> Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
> Unfortunately the error reports don't really give a clue as to what
> the root cause is.
>
> I used to be able to forcibly trigger similar errors with their
> 7200.9 disks, but I can't seem to do that any more.
>
>

Hello,

I'm jumping into this thread because I'm seeing the same problem on my
system, with a Promise SATAII 150 TX4 (PDC40518) and a Maxtor 6L200M0
(BANC1E00) hard drive, with the following error:

Jun 18 01:16:03 buffy kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Jun 18 01:16:03 buffy kernel: ata1.00: (port_status 0x20080000)
Jun 18 01:16:03 buffy kernel: ata1.00: cmd c8/00:15:e1:e3:16/00:00:00:00:00/e6 tag 0 cdb 0x0 data 10752 in
Jun 18 01:16:03 buffy kernel:          res 50/00:00:f5:e3:16/00:00:00:00:00/e6 Emask 0x2 (HSM violation)
Jun 18 01:16:03 buffy kernel: ata1: soft resetting port
Jun 18 01:16:03 buffy kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jun 18 01:16:03 buffy kernel: ata1.00: ata_hpa_resize 1: sectors = 398297088, hpa_sectors = 398297088
Jun 18 01:16:03 buffy kernel: ata1.00: ata_hpa_resize 1: sectors = 398297088, hpa_sectors = 398297088
Jun 18 01:16:03 buffy kernel: ata1.00: configured for UDMA/133
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
Jun 18 01:16:03 buffy kernel: Descriptor sense data with sense descriptors (in hex):
Jun 18 01:16:03 buffy kernel:         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Jun 18 01:16:03 buffy kernel:         06 16 e3 f5
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
Jun 18 01:16:03 buffy kernel: end_request: I/O error, dev sda, sector 102163425
Jun 18 01:16:03 buffy kernel: ata1: EH complete
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Write Protect is off
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

In normal use this error shows up only once a week, but when
transferring a lot of data (> 100MB) to a USB stick, the error shows up
every few seconds, with only the data values differing. When
transferring data from the USB stick to the hard drive, no error shows up.

Other information on my system:

smartctl -d sat -a /dev/sda

smartctl version 5.38 [i686-suse-linux] Copyright (C) 2002-7 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Maxtor DiamondMax 10 family (ATA/133 and SATA/150)
Device Model:     Maxtor 6L200M0
Serial Number:    L40A4PDH
Firmware Version: BANC1E00
User Capacity:    203.928.109.056 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Mon Jun 18 19:11:50 2007 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Warning! SMART Attribute Thresholds Structure error: invalid SMART checksum.
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (1562) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  81) minutes.
SCT capabilities:              (0x0021) SCT Status supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   206   204   063    Pre-fail  Always       -       10179
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       1502
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   246   240   187    Pre-fail  Always       -       37304
  9 Power_On_Minutes        0x0032   239   239   000    Old_age   Always       -       539h+13m
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   250   250   000    Old_age   Always       -       1570
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   031   253   000    Old_age   Always       -       33
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       9263
196 Reallocated_Event_Count 0x0008   253   253   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0008   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       0
202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       0
204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always       -       0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
209 Offline_Seek_Performnce 0x0024   239   239   000    Old_age   Offline      -       179
210 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
211 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
212 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0

Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1163         -
# 2  Short offline       Completed without error       00%      1163         -
# 3  Offline             Aborted by host               70%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
        Model Number:       Maxtor 6L200M0
        Serial Number:      L40A4PDH
        Firmware Revision:  BANC1E00
Standards:
        Used: ATA/ATAPI-7 T13 1532D revision 0
        Supported: 7 6 5 4
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  398297088
        device size with M = 1024*1024:      194481 MBytes
        device size with M = 1000*1000:      203928 MBytes (203 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 0
        Advanced power management level: unknown setting (0x0000)
        Recommended acoustic management value: 192, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_VERIFY command
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
                Advanced Power Management feature set
                SET_MAX security extension
           *    Automatic Acoustic Management feature set
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    SATA-I signaling speed (1.5Gb/s)
           *    Native Command Queueing (NCQ)
                Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Data Tables (AC5)
Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
        not     supported: enhanced erase
Checksum: correct

lspci

00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
00:06.0 Ethernet controller: D-Link System Inc RTL8139 Ethernet (rev 10)
00:07.0 Mass storage controller: Promise Technology, Inc. PDC20518/PDC40518 (SATAII 150 TX4) (rev 02)
00:0b.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 08)
00:0b.1 Input device controller: Creative Labs SB Live! Game Port (rev 08)
00:0c.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
01:00.0 VGA compatible controller: nVidia Corporation NV25 [GeForce4 Ti 4200] (rev a3)

Perhaps this will help to resolve the problem.

Ansgar
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 7:05 Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
2007-06-18 17:14 ` Ansgar Knappheide
@ 2007-06-18 18:54 ` Tomi Orava
2 siblings, 0 replies; 41+ messages in thread
From: Tomi Orava @ 2007-06-18 18:54 UTC (permalink / raw)
Cc: htejun, florian, jeff, linux-ide, mikpe

> On Mon, 18 Jun 2007 15:28:44 +0900, Tejun Heo wrote:
>> Yeah, it seems promise has some problem with 3G link. Cc'ing Mikael
>> Pettersson and quoting whole body for him. Mikael, does this look
>> familiar?
>>
>> Tomi Orava wrote:
>> > Hi Tejun,
>> >
>> > I've been trying to find a solution for a long time for quite a similar
>> > libata error messages as shown in this thread. Perhaps you might have
>> > some ideas what the actual originator might be:
>> >
>> > With the latest 2.6.22-rc4-git4 kernel I still get the following error
>> > messages with high I/O load:

<snip>

>> > and the problems relate only to Seagate 7200.10 SATA-disks, never with the
>> > older 7200.7 SATA-disks all connected to Promise Sata 300TX4-controller.
> ...
>> > PS. These problems are not special to this single machine as a friend at work
>> > has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10
>> > SATA-disks on an intel-based P4 machine with similar problems under
>> > I/O-load.
>
> Yes, this is familiar. Several people have reported problems with
> Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
> Unfortunately the error reports don't really give a clue as to what
> the root cause is.
>
> I used to be able to forcibly trigger similar errors with their
> 7200.9 disks, but I can't seem to do that any more.

Hmm, are you really sure that this is related to 3Gbps mode?
I'm wondering about that, as the problem is there whether or not the
1.5Gbps jumper is set on the 7200.10 disks. I also retested your older
sata_promise 1.5Gbps speed limit patch and it did not fix the problem.
This is really strange!

I've now connected the two problematic 7200.10 disks to a Via VT6420
controller and the problem has been fixed for me (for now).
It would be great to figure out what the actual problem is here, though ...

Regards,
Tomi Orava

--
Tomi.Orava@ncircle.nullnet.fi
^ permalink raw reply [flat|nested] 41+ messages in thread
* libata interface fatal error
@ 2007-05-24 13:25 Florian Effenberger
2007-05-24 13:45 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 13:25 UTC (permalink / raw)
To: jgarzik, linux-ide
Hi there,
seems I've always subscribed to SATA problems. :-)
We installed Debian Etch with the pre-compiled kernel, but when doing
heavy SATA data transfer, the drives seem to make trouble. Even with the
latest kernel, 2.6.21.2, we receive:
===
ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x400100 action 0x2 frozen
ata3.00: (irq_stat 0x08000000, interface fatal error)
ata3.00: cmd 61/80:00:00:91:91/00:00:1d:00:00/40 tag 0 cdb 0x0 data 65536 out
         res 40/00:04:00:91:91/00:00:1d:00:00/40 Emask 0x10 (ATA bus error)
ata3: soft resetting port
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3: EH complete
SCSI device sdc: 625142448 512-byte hdwr sectors (320073 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
===
MD5 sums of copied files are right and we experience no other problems.
Is this a driver bug? If so, can I be of any help in debugging it?
lspci gives:
===
00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub
(rev 02)
00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port
(rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio
Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface
Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI
Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller
(rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Unknown device
016a (rev a1)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown
device 4364 (rev 12)
04:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363
AHCI Controller (rev 02)
04:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363
AHCI Controller (rev 02)
05:01.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 0c)
05:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23
IEEE-1394a-2000 Controller (PHY/Link)
===
Thanks
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 13:25 Florian Effenberger
@ 2007-05-24 13:45 ` Tejun Heo
2007-05-24 14:08 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 13:45 UTC (permalink / raw)
To: Florian Effenberger; +Cc: jgarzik, linux-ide

Hello,

Florian Effenberger wrote:
> We installed Debian Etch with the pre-compiled kernel, but when doing
> heavy SATA data transfer, the drives seem to make trouble. Even with the
> latest kernel, 2.6.21.2, we receive:
>
> ===
> ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x400100 action 0x2 frozen
> ata3.00: (irq_stat 0x08000000, interface fatal error)
> ata3.00: cmd 61/80:00:00:91:91/00:00:1d:00:00/40 tag 0 cdb 0x0 data 65536 out
>          res 40/00:04:00:91:91/00:00:1d:00:00/40 Emask 0x10 (ATA bus error)
> ata3: soft resetting port
> ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata3.00: configured for UDMA/133
> ata3: EH complete
> SCSI device sdc: 625142448 512-byte hdwr sectors (320073 MB)
> sdc: Write Protect is off
> sdc: Mode Sense: 00 3a 00 00
> SCSI device sdc: write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ===

Looks like a genuine transmission/interface error to me. How often does
this occur? Please try to connect the drive to another port using and
possibly different power lane. Also, testing with another drive is a
good way to track down where the problem is.

> MD5 sums of copied files are right and we experience no other problems.
> Is this a driver bug? If so, can I be of any help in debugging it?

Yeah, libata EH is working properly so there shouldn't be any problem
other than the error messages and a bit slower transfer speed.

--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
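The `SErr 0x400100` value in the log above is a bitmask, and decoding it helps explain why the error reads as a link-level problem. The following is a minimal sketch, assuming the SError bit layout from the SATA specification as mirrored by the `SERR_*` flags in the kernel's `include/linux/libata.h`; the short labels and the `decode_serr` helper are illustrative names, not part of any kernel interface.

```python
# Bit positions of the SATA SError register (assumed to match the
# SERR_* flags in include/linux/libata.h). Labels are illustrative.
SERR_BITS = {
    0: "RecovData",     # recovered data integrity error
    1: "RecovComm",     # recovered communication error
    8: "UnrecovData",   # non-recovered transient data integrity error
    9: "Persist",       # persistent communication or data integrity error
    10: "Proto",        # protocol error
    11: "HostInt",      # internal error
    16: "PHYRdyChg",    # PhyRdy changed
    17: "PHYInt",       # PHY internal error
    18: "CommWake",     # COMWAKE detected
    19: "10B8B",        # 10b-to-8b decode error
    20: "Dispar",       # disparity error
    21: "BadCRC",       # CRC error
    22: "Handshk",      # handshake error
    23: "LinkSeq",      # link sequence error
    24: "TrStaTrns",    # transport state transition error
    25: "UnrecFIS",     # unrecognized FIS type
    26: "DevExch",      # device exchanged
}


def decode_serr(value):
    """Return the labels of all SError bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # Value taken from the "exception Emask 0x10 ... SErr 0x400100" line.
    print(decode_serr(0x400100))
```

Under these assumptions, `0x400100` has bits 8 and 22 set, i.e. an unrecovered data integrity error together with a handshake error, both of which are reported by the link layer rather than by the drive's media.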
* Re: libata interface fatal error
2007-05-24 13:45 ` Tejun Heo
@ 2007-05-24 14:08 ` Florian Effenberger
2007-05-24 14:21 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 14:08 UTC (permalink / raw)
To: Tejun Heo; +Cc: jgarzik, linux-ide

Hi,

thanks for the fast reply!

> Looks like a genuine transmission/interface error to me. How often does
> this occur? Please try to connect the drive to another port using and
> possibly different power lane. Also, testing with another drive is a
> good way to track down where the problem is.

it occurs as soon as the drive is being used heavily (load of about 2,x
on the machine when running our test scripts). About 15 times in 2 or 3
hours. Will try to change port, power supply and drive.

> Yeah, libata EH is working properly so there shouldn't be any problem
> other than the error messages and a bit slower transfer speed.

So, even if the errors are still there, there is nothing real to worry
about for me?

There are now new errors with hard errors, is this still ok?

===
ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata4.00: cmd 60/80:00:00:09:97/00:00:0a:00:00/40 tag 0 cdb 0x0 data 65536 in
         res 40/00:04:00:67:14/00:00:1c:00:00/40 Emask 0x4 (timeout)
ata4: soft resetting port
ata4: softreset failed (1st FIS failed)
ata4: softreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: port is slow to respond, please be patient (Status 0x80)
ata4: port failed to respond (30 secs, Status 0x80)
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
SCSI device sdd: 625142448 512-byte hdwr sectors (320073 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
===

Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 14:08 ` Florian Effenberger
@ 2007-05-24 14:21 ` Tejun Heo
2007-05-24 14:47 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 14:21 UTC (permalink / raw)
To: Florian Effenberger; +Cc: jgarzik, linux-ide

Florian Effenberger wrote:
>> Looks like a genuine transmission/interface error to me. How often does
>> this occur? Please try to connect the drive to another port using and
>> possibly different power lane. Also, testing with another drive is a
>> good way to track down where the problem is.
>
> it occurs as soon as the drive is being used heavily (load of about 2,x
> on the machine when running our test scripts). About 15 times in 2 or 3
> hours. Will try to change port, power supply and drive.
>
>> Yeah, libata EH is working properly so there shouldn't be any problem
>> other than the error messages and a bit slower transfer speed.
>
> So, even if the errors are still there, there is nothing real to worry
> about for me?

Data integrity wise there should be no problem but your error rate is
pretty high and eventually will make libata turn off NCQ and/or speed
down PHY speed.

> There are now new errors with hard errors, is this still ok?
>
> ===
> ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
> ata4.00: cmd 60/80:00:00:09:97/00:00:0a:00:00/40 tag 0 cdb 0x0 data 65536 in
>          res 40/00:04:00:67:14/00:00:1c:00:00/40 Emask 0x4 (timeout)

Yeap, your data is safe. With timeouts, data transfer speed can be much
lower tho. It definitely seems something is wrong with your hardware setup.

--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 14:21 ` Tejun Heo
@ 2007-05-24 14:47 ` Florian Effenberger
2007-05-24 14:53 ` Tejun Heo
2007-05-24 14:55 ` Greg Freemyer
0 siblings, 2 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 14:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: jgarzik, linux-ide

Hi,

> Data integrity wise there should be no problem but your error rate is
> pretty high and eventually will make libata turn off NCQ and/or speed
> down PHY speed.

switching ports is not easy. Both on-board SATA controllers are being
used, and the error seems to occur on all ports.

> Yeap, your data is safe. With timeouts, data transfer speed can be much
> lower tho. It definitely seems something is wrong with your hardware setup.

I will try to use another test disk. Right now we use different models
of Western Digital "RAID edition".

Any other debug information I could gather?

Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 14:47 ` Florian Effenberger
@ 2007-05-24 14:53 ` Tejun Heo
2007-05-24 15:28 ` Florian Effenberger
2007-05-24 14:55 ` Greg Freemyer
1 sibling, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 14:53 UTC (permalink / raw)
To: Florian Effenberger; +Cc: jgarzik, linux-ide

Florian Effenberger wrote:
> Hi,
>
>> Data integrity wise there should be no problem but your error rate is
>> pretty high and eventually will make libata turn off NCQ and/or speed
>> down PHY speed.
>
> switching ports is not easy. Both on-board SATA controllers are being
> used, and the error seems to occur on all ports.

Hmmmm...

>> Yeap, your data is safe. With timeouts, data transfer speed can be much
>> lower tho. It definitely seems something is wrong with your hardware
>> setup.
>
> I will try to use another test disk. Right now we use different models
> of Western Digital "RAID edition".
>
> Any other debug information I could gather?

If you let the system run, libata will turn off NCQ and/or lower PHY
speed to 1.5Gbps. Do errors disappear after that happens?

--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 14:53 ` Tejun Heo
@ 2007-05-24 15:28 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 15:28 UTC (permalink / raw)
To: Tejun Heo; +Cc: jgarzik, linux-ide

We just disabled the RAID (Linux software RAID, no hardware RAID) and
tested with one disk only, same results.

What string should I grep the logs for when things are being lowered?

> If you let the system run, libata will turn off NCQ and/or lower PHY
> speed to 1.5Gbps. Do errors disappear after that happens?
^ permalink raw reply [flat|nested] 41+ messages in thread
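One way to see whether the link speed has been lowered is to follow the `SATA link up` lines that appear in the logs quoted in this thread. The sketch below extracts the port, the negotiated speed, and the speed limit encoded in SControl from such lines. It assumes the values are printed in hexadecimal and that bits 7:4 of SControl hold the SPD field (0 for no restriction, 1 for Gen1, 2 for Gen2), as described in the SATA specification; the `link_events` helper and the `SPD_LIMIT` table are illustrative names.

```python
import re

# Matches lines such as:
#   ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
LINK_UP = re.compile(
    r"(ata\d+): SATA link up ([\d.]+) Gbps "
    r"\(SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)\)"
)

# SControl SPD field (bits 7:4), assumed per the SATA specification.
SPD_LIMIT = {
    0: "no limit",
    1: "limited to 1.5 Gbps",
    2: "limited to 3.0 Gbps",
}


def link_events(lines):
    """Return (port, negotiated_gbps, speed_limit) for each 'link up' line."""
    events = []
    for line in lines:
        match = LINK_UP.search(line)
        if match is None:
            continue
        port, gbps, _sstatus, scontrol = match.groups()
        spd = (int(scontrol, 16) >> 4) & 0xF
        events.append((port, float(gbps), SPD_LIMIT.get(spd, f"SPD={spd}")))
    return events


if __name__ == "__main__":
    sample = [
        "ata3: soft resetting port",
        "ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)",
        "ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
    ]
    for event in link_events(sample):
        print(event)
```

Feeding it the output of `dmesg` or a saved kernel log would show a port going from `3.0` with no limit to `1.5` with a Gen1 limit once the speed has been reduced, as in the `SControl 310` line quoted elsewhere in the thread.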
* Re: libata interface fatal error
2007-05-24 14:47 ` Florian Effenberger
2007-05-24 14:53 ` Tejun Heo
@ 2007-05-24 14:55 ` Greg Freemyer
2007-05-24 14:59 ` Tejun Heo
2007-05-24 15:00 ` Florian Effenberger
1 sibling, 2 replies; 41+ messages in thread
From: Greg Freemyer @ 2007-05-24 14:55 UTC (permalink / raw)
To: Florian Effenberger; +Cc: Tejun Heo, jgarzik, linux-ide

On 5/24/07, Florian Effenberger <florian@effenberger.org> wrote:
> Hi,
>
> > Data integrity wise there should be no problem but your error rate is
> > pretty high and eventually will make libata turn off NCQ and/or speed
> > down PHY speed.
>
> switching ports is not easy. Both on-board SATA controllers are being
> used, and the error seems to occur on all ports.
>
> > Yeap, your data is safe. With timeouts, data transfer speed can be much
> > lower tho. It definitely seems something is wrong with your hardware setup.
>
> I will try to use another test disk. Right now we use different models
> of Western Digital "RAID edition".
>

iiuc, raid editions are designed to fail fast thus allowing an
alternate drive to provide the data rather than having to wait thru
multiple internal retries.

Could this just be a case of the drive functioning as designed?

Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 14:55 ` Greg Freemyer
@ 2007-05-24 14:59 ` Tejun Heo
2007-05-24 15:00 ` Florian Effenberger
1 sibling, 0 replies; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 14:59 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Florian Effenberger, jgarzik, linux-ide

Greg Freemyer wrote:
>> I will try to use another test disk. Right now we use different models
>> of Western Digital "RAID edition".
>>
>
> iiuc, raid editions are designed to fail fast thus allowing an
> alternate drive to provide the data rather than having to wait thru
> multiple internal retries.
>
> Could this just be a case of the drive functioning as designed?

If that's the case, the drive should be aborting commands with ICRC bit
set reporting unrecoverable media error (AC_ERR_DEV | AC_ERR_MEDIA in
libata terms) but the errors are fatal interface errors and timeouts,
both of which are indicative of transmission problems on ATA link.

--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
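The `Emask` values in the logs can be mapped onto the `AC_ERR_*` flags mentioned above, which makes the distinction between device/media errors and link errors easier to check. This is a minimal sketch, assuming the bit assignments of the `AC_ERR_*` enum in the kernel's `include/linux/libata.h` (only bits 0 through 8 are listed; later kernels may define more); `decode_emask` is a hypothetical helper name.

```python
# Assumed bit assignments of the libata AC_ERR_* flags.
AC_ERR_FLAGS = [
    (1 << 0, "AC_ERR_DEV"),       # device reported error
    (1 << 1, "AC_ERR_HSM"),       # host state machine violation
    (1 << 2, "AC_ERR_TIMEOUT"),   # command timed out
    (1 << 3, "AC_ERR_MEDIA"),     # media error
    (1 << 4, "AC_ERR_ATA_BUS"),   # ATA bus error
    (1 << 5, "AC_ERR_HOST_BUS"),  # host bus error
    (1 << 6, "AC_ERR_SYSTEM"),    # host internal error
    (1 << 7, "AC_ERR_INVALID"),   # invalid argument
    (1 << 8, "AC_ERR_OTHER"),     # unknown error
]


def decode_emask(emask):
    """Return the names of all AC_ERR_* flags set in `emask`."""
    return [name for bit, name in AC_ERR_FLAGS if emask & bit]


if __name__ == "__main__":
    # Emask values that appear in the logs in this thread.
    for value in (0x10, 0x4, 0x2):
        print(hex(value), decode_emask(value))
```

With these assumptions, the `Emask 0x10`, `0x4` and `0x2` values seen in the thread decode to an ATA bus error, a timeout and an HSM violation respectively; none of them carries `AC_ERR_DEV` or `AC_ERR_MEDIA`, which would be the combination `0x9`.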
* Re: libata interface fatal error
2007-05-24 14:55 ` Greg Freemyer
2007-05-24 14:59 ` Tejun Heo
@ 2007-05-24 15:00 ` Florian Effenberger
1 sibling, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 15:00 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Tejun Heo, jgarzik, linux-ide

Hi,

> iiuc, raid editions are designed to fail fast thus allowing an
> alternate drive to provide the data rather than having to wait thru
> multiple internal retries.
>
> Could this just be a case of the drive functioning as designed?

to be honest, I don't know. :-)
Any jumper settings to change that, or any driver settings? Are you
aware of something like that?

Florian
^ permalink raw reply [flat|nested] 41+ messages in thread