* libata interface fatal error
@ 2007-05-24 13:25 Florian Effenberger
2007-05-24 13:45 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 13:25 UTC (permalink / raw)
To: jgarzik, linux-ide
Hi there,
seems I've always subscribed to SATA problems. :-)
We installed Debian Etch with the pre-compiled kernel, but when doing
heavy SATA data transfer, the drives seem to make trouble. Even with the
latest kernel, 2.6.21.2, we receive:
===
ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x400100 action 0x2 frozen
ata3.00: (irq_stat 0x08000000, interface fatal error)
ata3.00: cmd 61/80:00:00:91:91/00:00:1d:00:00/40 tag 0 cdb 0x0 data
65536 out
res 40/00:04:00:91:91/00:00:1d:00:00/40 Emask 0x10 (ATA bus error)
ata3: soft resetting port
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3: EH complete
SCSI device sdc: 625142448 512-byte hdwr sectors (320073 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
===
MD5 sums of copied files are right and we experience no other problems.
Is this a driver bug? If so, can I be of any help in debugging it?
lspci gives:
===
00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub
(rev 02)
00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port
(rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio
Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface
Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI
Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller
(rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Unknown device
016a (rev a1)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown
device 4364 (rev 12)
04:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363
AHCI Controller (rev 02)
04:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363
AHCI Controller (rev 02)
05:01.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 0c)
05:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23
IEEE-1394a-2000 Controller (PHY/Link)
===
Thanks
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-05-24 13:25 Florian Effenberger
@ 2007-05-24 13:45 ` Tejun Heo
2007-05-24 14:08 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 13:45 UTC (permalink / raw)
To: Florian Effenberger; +Cc: jgarzik, linux-ide
Hello,
Florian Effenberger wrote:
> We installed Debian Etch with the pre-compiled kernel, but when doing
> heavy SATA data transfer, the drives seem to make trouble. Even with the
> latest kernel, 2.6.21.2, we receive:
>
> ===
> ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x400100 action 0x2 frozen
> ata3.00: (irq_stat 0x08000000, interface fatal error)
> ata3.00: cmd 61/80:00:00:91:91/00:00:1d:00:00/40 tag 0 cdb 0x0 data
> 65536 out
> res 40/00:04:00:91:91/00:00:1d:00:00/40 Emask 0x10 (ATA bus error)
> ata3: soft resetting port
> ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata3.00: configured for UDMA/133
> ata3: EH complete
> SCSI device sdc: 625142448 512-byte hdwr sectors (320073 MB)
> sdc: Write Protect is off
> sdc: Mode Sense: 00 3a 00 00
> SCSI device sdc: write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ===
Looks like a genuine transmission/interface error to me. How often does
this occur? Please try connecting the drive to another port, possibly
using a different power lane. Also, testing with another drive is a
good way to track down where the problem is.
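For reference, the SErr value in the quoted log can be decoded bit by bit. The following is a minimal sketch, assuming the SError bit layout from the Serial ATA specification; the short labels and the `decode_serror` helper are illustrative names, not libata code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions follow the SATA SError register layout.
 * The labels are illustrative shorthand, not kernel identifiers. */
struct serr_bit {
    unsigned bit;
    const char *name;
};

static const struct serr_bit serr_bits[] = {
    { 0, "RecovData" },    /* recovered data integrity error */
    { 1, "RecovComm" },    /* recovered communications error */
    { 8, "UnrecovData" },  /* non-recovered transient data integrity error */
    { 9, "Persist" },      /* persistent communication or data integrity error */
    { 10, "Proto" },       /* protocol error */
    { 11, "HostInt" },     /* internal error */
    { 16, "PHYRdyChg" },   /* PhyRdy change */
    { 17, "PHYInt" },      /* PHY internal error */
    { 18, "CommWake" },    /* COMWAKE detected */
    { 19, "10B8B" },       /* 10b to 8b decode error */
    { 20, "Dispar" },      /* disparity error */
    { 21, "BadCRC" },      /* CRC error */
    { 22, "Handshk" },     /* handshake error */
    { 23, "LinkSeq" },     /* link sequence error */
    { 24, "TrStaTrns" },   /* transport state transition error */
    { 25, "UnrecFIS" },    /* unrecognized FIS type */
    { 26, "DevExch" },     /* device exchanged */
};

/* Writes space-separated labels for every set bit into out.
 * Returns the number of labels written. */
static int decode_serror(uint32_t serr, char *out, size_t outlen)
{
    size_t used = 0;
    int count = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(serr_bits) / sizeof(serr_bits[0]); i++) {
        if (!(serr & (UINT32_C(1) << serr_bits[i].bit)))
            continue;
        int w = snprintf(out + used, outlen - used, "%s%s",
                         count ? " " : "", serr_bits[i].name);
        if (w < 0 || (size_t)w >= outlen - used) {
            out[used] = '\0';  /* drop a partially written label */
            break;
        }
        used += (size_t)w;
        count++;
    }
    return count;
}
```

For the value above, `decode_serror(0x400100, buf, sizeof buf)` fills `buf` with `UnrecovData Handshk` (bits 8 and 22), which is consistent with a transmission problem rather than a media error.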
> MD5 sums of copied files are right and we experience no other problems.
> Is this a driver bug? If so, can I be of any help in debugging it?
Yeah, libata EH is working properly so there shouldn't be any problem
other than the error messages and a bit slower transfer speed.
--
tejun
* Re: libata interface fatal error
2007-05-24 13:45 ` Tejun Heo
@ 2007-05-24 14:08 ` Florian Effenberger
2007-05-24 14:21 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 14:08 UTC (permalink / raw)
To: Tejun Heo; +Cc: jgarzik, linux-ide
Hi,
thanks for the fast reply!
> Looks like a genuine transmission/interface error to me. How often does
> this occur? Please try connecting the drive to another port, possibly
> using a different power lane. Also, testing with another drive is a
> good way to track down where the problem is.
it occurs as soon as the drive is being used heavily (load of about 2,x
on the machine when running our test scripts). About 15 times in 2 or 3
hours. Will try to change port, power supply and drive.
> Yeah, libata EH is working properly so there shouldn't be any problem
> other than the error messages and a bit slower transfer speed.
So, even if the errors are still there, there is nothing real to worry
about for me?
There are now new errors that involve hard resets; is this still OK?
===
ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata4.00: cmd 60/80:00:00:09:97/00:00:0a:00:00/40 tag 0 cdb 0x0 data 65536 in
res 40/00:04:00:67:14/00:00:1c:00:00/40 Emask 0x4 (timeout)
ata4: soft resetting port
ata4: softreset failed (1st FIS failed)
ata4: softreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: port is slow to respond, please be patient (Status 0x80)
ata4: port failed to respond (30 secs, Status 0x80)
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
SCSI device sdd: 625142448 512-byte hdwr sectors (320073 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
===
Florian
* Re: libata interface fatal error
2007-05-24 14:08 ` Florian Effenberger
@ 2007-05-24 14:21 ` Tejun Heo
2007-05-24 14:47 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 14:21 UTC (permalink / raw)
To: Florian Effenberger; +Cc: jgarzik, linux-ide
Florian Effenberger wrote:
>> Looks like a genuine transmission/interface error to me. How often does
>> this occur? Please try connecting the drive to another port, possibly
>> using a different power lane. Also, testing with another drive is a
>> good way to track down where the problem is.
>
> it occurs as soon as the drive is being used heavily (load of about 2,x
> on the machine when running our test scripts). About 15 times in 2 or 3
> hours. Will try to change port, power supply and drive.
>
>> Yeah, libata EH is working properly so there shouldn't be any problem
>> other than the error messages and a bit slower transfer speed.
>
> So, even if the errors are still there, there is nothing real to worry
> about for me?
Data integrity-wise there should be no problem, but your error rate is
pretty high and will eventually make libata turn off NCQ and/or lower
the PHY speed.
> There are now new errors that involve hard resets; is this still OK?
>
> ===
> ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
> ata4.00: cmd 60/80:00:00:09:97/00:00:0a:00:00/40 tag 0 cdb 0x0 data
> 65536 in
> res 40/00:04:00:67:14/00:00:1c:00:00/40 Emask 0x4 (timeout)
Yeap, your data is safe. With timeouts, data transfer speed can be much
lower tho. It definitely seems something is wrong with your hardware setup.
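The `cmd` line quoted above can be unpacked by hand. A minimal sketch, assuming the field order libata uses when printing a taskfile (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and the NCQ convention that the sector count travels in the FEATURE fields; `parse_ncq_tf` and `struct tf_summary` are hypothetical names:

```c
#include <stdint.h>
#include <stdio.h>

struct tf_summary {
    unsigned command;  /* 0x60 = READ FPDMA QUEUED, 0x61 = WRITE FPDMA QUEUED */
    uint64_t lba;      /* 48-bit starting sector */
    unsigned sectors;  /* number of 512-byte sectors requested */
};

/* Parses the hex string that follows "cmd " in a libata error report.
 * Returns 0 on success, -1 if the string is malformed or not an NCQ command. */
static int parse_ncq_tf(const char *s, struct tf_summary *out)
{
    unsigned v[12];

    if (sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5],
               &v[6], &v[7], &v[8], &v[9], &v[10], &v[11]) != 12)
        return -1;
    if (v[0] != 0x60 && v[0] != 0x61)
        return -1;

    out->command = v[0];

    /* NCQ carries the count in hob_feature:feature; 0 means 65536 sectors. */
    out->sectors = (v[6] << 8) | v[1];
    if (out->sectors == 0)
        out->sectors = 65536;

    /* LBA bytes, least significant first: lbal, lbam, lbah, then the hob_* trio. */
    out->lba = (uint64_t)v[3] |
               (uint64_t)v[4] << 8 |
               (uint64_t)v[5] << 16 |
               (uint64_t)v[8] << 24 |
               (uint64_t)v[9] << 32 |
               (uint64_t)v[10] << 40;
    return 0;
}
```

For `60/80:00:00:09:97/00:00:0a:00:00/40` this gives a READ FPDMA QUEUED of 128 sectors (128 x 512 = 65536 bytes, matching `data 65536 in`) starting at LBA 0x0a970900.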
--
tejun
* Re: libata interface fatal error
2007-05-24 14:21 ` Tejun Heo
@ 2007-05-24 14:47 ` Florian Effenberger
2007-05-24 14:53 ` Tejun Heo
2007-05-24 14:55 ` Greg Freemyer
0 siblings, 2 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 14:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: jgarzik, linux-ide
Hi,
> Data integrity-wise there should be no problem, but your error rate is
> pretty high and will eventually make libata turn off NCQ and/or lower
> the PHY speed.
switching ports is not easy. Both on-board SATA controllers are being
used, and the error seems to occur on all ports.
> Yeap, your data is safe. With timeouts, data transfer speed can be much
> lower tho. It definitely seems something is wrong with your hardware setup.
I will try to use another test disk. Right now we use different models
of Western Digital "RAID edition".
Any other debug information I could gather?
Florian
* Re: libata interface fatal error
2007-05-24 14:47 ` Florian Effenberger
@ 2007-05-24 14:53 ` Tejun Heo
2007-05-24 15:28 ` Florian Effenberger
2007-05-24 14:55 ` Greg Freemyer
1 sibling, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 14:53 UTC (permalink / raw)
To: Florian Effenberger; +Cc: jgarzik, linux-ide
Florian Effenberger wrote:
> Hi,
>
>> Data integrity-wise there should be no problem, but your error rate is
>> pretty high and will eventually make libata turn off NCQ and/or lower
>> the PHY speed.
>
> switching ports is not easy. Both on-board SATA controllers are being
> used, and the error seems to occur on all ports.
Hmmmm...
>> Yeap, your data is safe. With timeouts, data transfer speed can be much
>> lower tho. It definitely seems something is wrong with your hardware
>> setup.
>
> I will try to use another test disk. Right now we use different models
> of Western Digital "RAID edition".
>
> Any other debug information I could gather?
If you let the system run, libata will turn off NCQ and/or lower PHY
speed to 1.5Gbps. Do errors disappear after that happens?
--
tejun
* Re: libata interface fatal error
2007-05-24 14:47 ` Florian Effenberger
2007-05-24 14:53 ` Tejun Heo
@ 2007-05-24 14:55 ` Greg Freemyer
2007-05-24 14:59 ` Tejun Heo
2007-05-24 15:00 ` Florian Effenberger
1 sibling, 2 replies; 41+ messages in thread
From: Greg Freemyer @ 2007-05-24 14:55 UTC (permalink / raw)
To: Florian Effenberger; +Cc: Tejun Heo, jgarzik, linux-ide
On 5/24/07, Florian Effenberger <florian@effenberger.org> wrote:
> Hi,
>
> > Data integrity-wise there should be no problem, but your error rate is
> > pretty high and will eventually make libata turn off NCQ and/or lower
> > the PHY speed.
>
> switching ports is not easy. Both on-board SATA controllers are being
> used, and the error seems to occur on all ports.
>
> > Yeap, your data is safe. With timeouts, data transfer speed can be much
> > lower tho. It definitely seems something is wrong with your hardware setup.
>
> I will try to use another test disk. Right now we use different models
> of Western Digital "RAID edition".
>
iiuc, raid editions are designed to fail fast thus allowing an
alternate drive to provide the data rather than having to wait thru
multiple internal retries.
Could this just be a case of the drive functioning as designed?
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
* Re: libata interface fatal error
2007-05-24 14:55 ` Greg Freemyer
@ 2007-05-24 14:59 ` Tejun Heo
2007-05-24 15:00 ` Florian Effenberger
1 sibling, 0 replies; 41+ messages in thread
From: Tejun Heo @ 2007-05-24 14:59 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Florian Effenberger, jgarzik, linux-ide
Greg Freemyer wrote:
>> I will try to use another test disk. Right now we use different models
>> of Western Digital "RAID edition".
>>
>
> iiuc, raid editions are designed to fail fast thus allowing an
> alternate drive to provide the data rather than having to wait thru
> multiple internal retries.
>
> Could this just be a case of the drive functioning as designed?
If that's the case, the drive should be aborting commands with the UNC bit
set, reporting an unrecoverable media error (AC_ERR_DEV | AC_ERR_MEDIA in
libata terms), but the errors are fatal interface errors and timeouts,
both of which are indicative of transmission problems on the ATA link.
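The distinction drawn above can be written down as a small classifier. This is a sketch only: the TIMEOUT and ATA_BUS values are the ones the logs in this thread confirm (Emask 0x4 is reported as "timeout", Emask 0x10 as "ATA bus error"), while the DEV and MEDIA values, the `ERR_*` names and `emask_hint` are assumptions made for illustration.

```c
/* Error mask bits, modelled on libata's AC_ERR_* flags. */
enum {
    ERR_DEV     = 1 << 0,  /* device reported an error (assumed value) */
    ERR_TIMEOUT = 1 << 2,  /* "timeout" in the log: Emask 0x4 */
    ERR_MEDIA   = 1 << 3,  /* media error (assumed value) */
    ERR_ATA_BUS = 1 << 4,  /* "ATA bus error" in the log: Emask 0x10 */
};

/* Returns a rough hint about where to look first.
 * Link-side bits win when both kinds are present. */
static const char *emask_hint(unsigned emask)
{
    if (emask == 0)
        return "none";
    if (emask & (ERR_ATA_BUS | ERR_TIMEOUT))
        return "link";    /* cabling, power, controller, PHY */
    if (emask & (ERR_DEV | ERR_MEDIA))
        return "device";  /* the drive itself reported the failure */
    return "other";
}
```

Both masks seen in this thread, 0x10 and 0x4, come out as `link`, whereas a fail-fast drive reporting a bad sector would be expected to produce the `device` case.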
--
tejun
* Re: libata interface fatal error
2007-05-24 14:55 ` Greg Freemyer
2007-05-24 14:59 ` Tejun Heo
@ 2007-05-24 15:00 ` Florian Effenberger
1 sibling, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 15:00 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Tejun Heo, jgarzik, linux-ide
Hi,
> iiuc, raid editions are designed to fail fast thus allowing an
> alternate drive to provide the data rather than having to wait thru
> multiple internal retries.
>
> Could this just be a case of the drive functioning as designed?
to be honest, I don't know. :-)
Any jumper settings to change that, or any driver settings? Are you
aware of something like that?
Florian
* Re: libata interface fatal error
2007-05-24 14:53 ` Tejun Heo
@ 2007-05-24 15:28 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-24 15:28 UTC (permalink / raw)
To: Tejun Heo; +Cc: jgarzik, linux-ide
We just disabled the RAID (Linux software RAID, no hardware RAID) and
tested with one disk only; same results.
What string should I grep the logs for to see when NCQ is turned off or
the speed is lowered?
> If you let the system run, libata will turn off NCQ and/or lower PHY
> speed to 1.5Gbps. Do errors disappear after that happens?
* Re: libata interface fatal error
@ 2007-05-26 9:43 Florian Effenberger
2007-05-29 9:16 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-05-26 9:43 UTC (permalink / raw)
To: linux-ide; +Cc: htejun, jeff
Hi,
it seems that the speed is never lowered, I always see "SATA link up 3.0
Gbps (SStatus 123 SControl 300)".
Can I manually lower the speed via a kernel parameter?
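The two register values in that message can be split into their fields. A minimal sketch, assuming the SStatus layout from the Serial ATA specification (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); `decode_sstatus` and `spd_name` are hypothetical helpers:

```c
#include <stdint.h>

struct sstatus_fields {
    unsigned det;  /* device detection state */
    unsigned spd;  /* negotiated interface speed */
    unsigned ipm;  /* interface power management state */
};

/* Splits a raw SStatus value into its three 4-bit fields. */
static struct sstatus_fields decode_sstatus(uint32_t v)
{
    struct sstatus_fields f;

    f.det = v & 0xf;
    f.spd = (v >> 4) & 0xf;
    f.ipm = (v >> 8) & 0xf;
    return f;
}

/* Human-readable name for the SPD field. */
static const char *spd_name(unsigned spd)
{
    switch (spd) {
    case 0:  return "none negotiated";
    case 1:  return "1.5 Gbps";
    case 2:  return "3.0 Gbps";
    default: return "unknown";
    }
}
```

`decode_sstatus(0x123)` yields DET 3 (device present, PHY communication established), SPD 2 (3.0 Gbps) and IPM 1 (interface active). Applying the same shift to SControl 0x300 gives an SPD field of 0, meaning no speed restriction is being requested.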
Thanks
Florian
* Re: libata interface fatal error
2007-05-26 9:43 Florian Effenberger
@ 2007-05-29 9:16 ` Tejun Heo
2007-05-29 14:16 ` Florian Effenberger
2007-06-06 21:23 ` Florian Effenberger
0 siblings, 2 replies; 41+ messages in thread
From: Tejun Heo @ 2007-05-29 9:16 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi,
>
> it seems that the speed is never lowered, I always see "SATA link up 3.0
> Gbps (SStatus 123 SControl 300)".
>
> Can I manually lower the speed via a kernel parameter?
Currently, there is no mechanism to do that, but hard drives usually have
a dip switch to force 1.5Gbps. Please try that. If your hard drive
doesn't have that, please lemme know. I'll prepare a simple patch.
Thanks.
--
tejun
* Re: libata interface fatal error
2007-05-29 9:16 ` Tejun Heo
@ 2007-05-29 14:16 ` Florian Effenberger
2007-06-06 21:23 ` Florian Effenberger
1 sibling, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-05-29 14:16 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Currently, there is no mechanism to do that, but hard drives usually have
> a dip switch to force 1.5Gbps. Please try that. If your hard drive
> doesn't have that, please lemme know. I'll prepare a simple patch.
thanks a lot, I will try that out and tell you the results. :-)
Florian
* Re: libata interface fatal error
2007-05-29 9:16 ` Tejun Heo
2007-05-29 14:16 ` Florian Effenberger
@ 2007-06-06 21:23 ` Florian Effenberger
2007-06-07 9:50 ` Tejun Heo
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-06 21:23 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> Currently, there is no mechanism to do that, but hard drives usually have
> a dip switch to force 1.5Gbps. Please try that. If your hard drive
> doesn't have that, please lemme know. I'll prepare a simple patch.
unfortunately, the disks have a jumper board, but the jumpers are
missing... could you write a patch for me? Would be much appreciated!
Thanks a lot!
Florian
* Re: libata interface fatal error
2007-06-06 21:23 ` Florian Effenberger
@ 2007-06-07 9:50 ` Tejun Heo
2007-06-07 14:08 ` Florian Effenberger
` (2 more replies)
0 siblings, 3 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-07 9:50 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
[-- Attachment #1: Type: text/plain, Size: 612 bytes --]
Florian Effenberger wrote:
> Hi,
>
>> Currently, there is no mechanism to do that, but hard drives usually have
>> a dip switch to force 1.5Gbps. Please try that. If your hard drive
>> doesn't have that, please lemme know. I'll prepare a simple patch.
>
> unfortunately, the disks have a jumper board, but the jumpers are
> missing... could you write a patch for me? Would be much appreciated!
Okay, there was a bug in link speed limit logic. That's probably why
speed down to 1.5Gbps didn't kick in. The attached patch contains the
fix and hack to force 1.5Gbps. Please give it a shot.
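The arithmetic behind that bug can be reproduced in isolation. The sketch below assumes the speed limit is a bitmask in which bit 0 stands for 1.5Gbps and bit 1 for 3.0Gbps, and uses the SControl value 0x300 from the logs earlier in the thread; `limit_before` and `limit_after` are hypothetical names that mirror the one-line change to libata-core.c in the attached patch.

```c
/* Old behaviour: always mask the limit with the SControl SPD field. */
static unsigned limit_before(unsigned hw_limit, unsigned scontrol)
{
    unsigned spd = (scontrol >> 4) & 0xf;

    /* When spd is 0 ("no restriction"), the mask is (1 << 0) - 1 == 0. */
    return hw_limit & ((1u << spd) - 1);
}

/* Patched behaviour: only apply the mask when SControl actually sets a limit. */
static unsigned limit_after(unsigned hw_limit, unsigned scontrol)
{
    unsigned spd = (scontrol >> 4) & 0xf;

    if (spd)
        hw_limit &= (1u << spd) - 1;
    return hw_limit;
}
```

With SControl 0x300 the SPD field is 0, so the old code clears every allowed speed: `limit_before(0x3, 0x300)` returns 0, while `limit_after(0x3, 0x300)` keeps 0x3. When SControl does carry a limit, for example 0x310, both versions return 0x1.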
Thanks.
--
tejun
[-- Attachment #2: ahci-force-1_5.patch --]
[-- Type: text/x-patch, Size: 2156 bytes --]
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 7baeaff..f9550f1 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -219,6 +219,7 @@ static int ahci_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
static unsigned int ahci_qc_issue(struct ata_queued_cmd *qc);
static void ahci_irq_clear(struct ata_port *ap);
static int ahci_port_start(struct ata_port *ap);
+static int ahci_vt8251_port_start(struct ata_port *ap);
static void ahci_port_stop(struct ata_port *ap);
static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf);
static void ahci_qc_prep(struct ata_queued_cmd *qc);
@@ -284,7 +285,7 @@ static const struct ata_port_operations ahci_ops = {
.port_resume = ahci_port_resume,
#endif
- .port_start = ahci_port_start,
+ .port_start = ahci_vt8251_port_start,
.port_stop = ahci_port_stop,
};
@@ -318,7 +319,7 @@ static const struct ata_port_operations ahci_vt8251_ops = {
.port_resume = ahci_port_resume,
#endif
- .port_start = ahci_port_start,
+ .port_start = ahci_vt8251_port_start,
.port_stop = ahci_port_stop,
};
@@ -1558,6 +1559,19 @@ static int ahci_port_start(struct ata_port *ap)
return 0;
}
+static int ahci_vt8251_port_start(struct ata_port *ap)
+{
+ struct ahci_host_priv *hpriv = ap->host->private_data;
+
+ if (((hpriv->cap >> 20) & 0xf) != 1) {
+ printk("limiting SATA link speed to 1.5Gbps\n");
+ ap->hw_sata_spd_limit = 1;
+ ap->eh_info.action |= ATA_EH_HARDRESET;
+ }
+
+ return ahci_port_start(ap);
+}
+
static void ahci_port_stop(struct ata_port *ap)
{
const char *emsg = NULL;
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4733f00..57940ba 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6313,7 +6313,8 @@ int ata_host_register(struct ata_host *host, struct scsi_host_template *sht)
/* init sata_spd_limit to the current value */
if (sata_scr_read(ap, SCR_CONTROL, &scontrol) == 0) {
int spd = (scontrol >> 4) & 0xf;
- ap->hw_sata_spd_limit &= (1 << spd) - 1;
+ if (spd)
+ ap->hw_sata_spd_limit &= (1 << spd) - 1;
}
ap->sata_spd_limit = ap->hw_sata_spd_limit;
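
The check in `ahci_vt8251_port_start()` above reads the Interface Speed Support field of the AHCI capabilities register. A minimal sketch of that test, assuming the AHCI encoding in which ISS occupies bits 23:20 and 1 means Gen 1 (1.5Gbps) while 2 means Gen 2 (3Gbps); `cap_iss` and `needs_gen1_limit` are illustrative names:

```c
#include <stdint.h>

/* Extracts the Interface Speed Support field from the HBA CAP register. */
static unsigned cap_iss(uint32_t cap)
{
    return (cap >> 20) & 0xf;
}

/* Mirrors the condition in the hack: limit the port unless the
 * controller already tops out at Gen 1. */
static int needs_gen1_limit(uint32_t cap)
{
    return cap_iss(cap) != 1;
}
```

A controller advertising Gen 2 (ISS = 2, for example a CAP value with 0x00200000 set) takes the branch and gets limited to 1.5Gbps; a Gen 1 only controller is left alone.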
* Re: libata interface fatal error
2007-06-07 9:50 ` Tejun Heo
@ 2007-06-07 14:08 ` Florian Effenberger
2007-06-13 10:37 ` Florian Effenberger
2007-06-16 10:23 ` Florian Effenberger
2 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-07 14:08 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Okay, there was a bug in link speed limit logic. That's probably why
> speed down to 1.5Gbps didn't kick in. The attached patch contains the
> fix and hack to force 1.5Gbps. Please give it a shot.
thanks a lot for your help, much appreciated! Will test it and let you
know if it works.
Florian
* Re: libata interface fatal error
2007-06-07 9:50 ` Tejun Heo
2007-06-07 14:08 ` Florian Effenberger
@ 2007-06-13 10:37 ` Florian Effenberger
2007-06-14 9:43 ` Tejun Heo
2007-06-16 10:23 ` Florian Effenberger
2 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-13 10:37 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Okay, there was a bug in link speed limit logic. That's probably why
> speed down to 1.5Gbps didn't kick in. The attached patch contains the
> fix and hack to force 1.5Gbps. Please give it a shot.
thanks a lot for your patch, it seems to work, at least better than
without patch. :-)
When rsyncing about 12 GB, no trouble occurred. When doing heavy stress
tests, I receive errors again, but okay, maybe that's due to a hardware bug.
Will your patch go into the vanilla kernel?
Florian
* Re: libata interface fatal error
2007-06-13 10:37 ` Florian Effenberger
@ 2007-06-14 9:43 ` Tejun Heo
2007-06-14 11:12 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-14 9:43 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi Tejun,
>
>> Okay, there was a bug in link speed limit logic. That's probably why
>> speed down to 1.5Gbps didn't kick in. The attached patch contains the
>> fix and hack to force 1.5Gbps. Please give it a shot.
>
> thanks a lot for your patch, it seems to work, at least better than
> without patch. :-)
>
> When rsyncing about 12 GB, no trouble occurred. When doing heavy stress
> tests, I receive errors again, but okay, maybe that's due to a hardware
> bug.
>
> Will your patch go into the vanilla kernel?
I'm currently not sure what the root cause is:
1. if the controller is at fault, we need to force 1.5Gbps on the
controller.
2. if the drive model is broken, we need to blacklist the drives.
3. if your specific configuration is broken (faulty hw, PSU, bad karma),
the upstream speed limit fix patch should be enough.
Can you post the result of 'hdparm -I /dev/sdX'?
--
tejun
* Re: libata interface fatal error
2007-06-14 9:43 ` Tejun Heo
@ 2007-06-14 11:12 ` Florian Effenberger
2007-06-14 12:25 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-14 11:12 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
[-- Attachment #1: Type: text/plain, Size: 222 bytes --]
Hi,
> Can you post the result of 'hdparm -I /dev/sdX'?
thanks a lot for your kind support, that is much appreciated!
Attached is some machine output, hope that helps. Let me know if you need
more information.
Florian
[-- Attachment #2: debug.txt --]
[-- Type: text/plain, Size: 18541 bytes --]
00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Unknown device 016a (rev a1)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
04:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
04:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
05:01.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 0c)
05:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
/dev/md0:
Version : 00.90.03
Creation Time : Tue May 1 15:56:11 2007
Raid Level : raid5
Array Size : 937713408 (894.27 GiB 960.22 GB)
Device Size : 312571136 (298.09 GiB 320.07 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu Jun 14 12:40:44 2007
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 6e3c156d:c91eb028:40daae21:698c531b
Events : 0.62
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
/dev/sda:
ATA device, with non-removable media
Model Number: WDC WD1600YS-01SHB1
Serial Number: WD-WCAP01819659
Firmware Revision: 20.06C06
Standards:
Supported: 7 6 5 4
Likely used: 7
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 321670847
device size with M = 1024*1024: 157065 MBytes
device size with M = 1000*1000: 164695 MBytes (164 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
DMA Setup Auto-Activate optimization
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12]
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
52min for SECURITY ERASE UNIT.
Checksum: correct
/dev/sdb:
ATA device, with non-removable media
Model Number: WDC WD3200YS-01PGB0
Serial Number: WD-WCAPD3405080
Firmware Revision: 21.00M21
Standards:
Supported: 7 6 5 4
Likely used: 7
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 625142448
device size with M = 1024*1024: 305245 MBytes
device size with M = 1000*1000: 320072 MBytes (320 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 1
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
DMA Setup Auto-Activate optimization
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12]
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Checksum: correct
/dev/sdc:
ATA device, with non-removable media
Model Number: WDC WD3200YS-01PGB0
Serial Number: WD-WCAPD4087913
Firmware Revision: 21.00M21
Standards:
Supported: 7 6 5 4
Likely used: 7
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 625142448
device size with M = 1024*1024: 305245 MBytes
device size with M = 1000*1000: 320072 MBytes (320 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 1
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
DMA Setup Auto-Activate optimization
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12]
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Checksum: correct
/dev/sdd:
ATA device, with non-removable media
Model Number: WDC WD3200YS-01PGB0
Serial Number: WD-WCAPD4124047
Firmware Revision: 21.00M21
Standards:
Supported: 7 6 5 4
Likely used: 7
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 625142448
device size with M = 1024*1024: 305245 MBytes
device size with M = 1000*1000: 320072 MBytes (320 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 1
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
DMA Setup Auto-Activate optimization
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12]
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Checksum: correct
/dev/sde:
ATA device, with non-removable media
Model Number: WDC WD3200YS-01PGB0
Serial Number: WD-WCAPD3406202
Firmware Revision: 21.00M21
Standards:
Supported: 7 6 5 4
Likely used: 7
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 625142448
device size with M = 1024*1024: 305245 MBytes
device size with M = 1000*1000: 320072 MBytes (320 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 1
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
DMA Setup Auto-Activate optimization
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12]
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Checksum: correct
* Re: libata interface fatal error
2007-06-14 11:12 ` Florian Effenberger
@ 2007-06-14 12:25 ` Tejun Heo
2007-06-14 15:12 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-14 12:25 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi,
>
>> Can you post the result of 'hdparm -I /dev/sdX'?
>
> thanks a lot for your kind support, that is much appreciated!
>
> Attached is some machine output, hope that helps. Let me know if you need
> more information.
Okay, ich8. I don't think the chipset is at fault here and you have a
lot of disks. My primary suspect is a power supply problem, but things
like this are hard to prove. With the merged speed down fix, libata
will do the right thing after a few errors, so ignoring the problem
wouldn't be too bad an idea. If you're curious, you can try connecting
the drives to different SATA ports and power lanes and see whether the
errors follow the disk, port or power lane.
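One way to see whether the errors follow the disk, the port or the power
lane is to tally the libata exception lines per port before and after each
swap. A minimal Python sketch, assuming the kernel log has been saved as
text (for example from dmesg); the function name count_exceptions and the
sample text are made up for illustration, with the log lines copied from
this thread:

```python
import re
from collections import Counter

# Matches libata exception lines such as:
#   ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x400100 action 0x2 frozen
# Group 1 is the port name without the device suffix (".00").
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask ")


def count_exceptions(log_text):
    """Return a Counter mapping port name (e.g. 'ata3') to exception count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input only: two exception lines and one unrelated line.
SAMPLE = """\
ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x400100 action 0x2 frozen
ata3: soft resetting port
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
"""

for port, n in count_exceptions(SAMPLE).most_common():
    print(f"{port}: {n} exception(s)")
```

Comparing the per-port counts from before and after moving a drive should
show whether the errors stayed with the port or moved with the disk.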
--
tejun
* Re: libata interface fatal error
2007-06-14 12:25 ` Tejun Heo
@ 2007-06-14 15:12 ` Florian Effenberger
2007-06-18 3:10 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-14 15:12 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Okay, ich8. I don't think the chipset is at fault here and you have a
> lot of disks. My primary suspect is a power supply problem, but things
> like this are hard to prove. With the merged speed down fix, libata
> will do the right thing after a few errors, so ignoring the problem
> wouldn't be too bad an idea. If you're curious, you can try connecting
> the drives to different SATA ports and power lanes and see whether the
> errors follow the disk, port or power lane.
exactly, should be four disks in the machine.
What power supply would you recommend for this type of disk? I think we
got a 450W Enermax, IIRC.
What do you mean by "merged speed down fix"? Is your fix for the speed
down logic implemented in the current kernel, so I don't have to patch
anymore (except when I want to force 1.5Gbps right from the beginning)?
All SATA ports are used, so reconnecting is not an option. But I can try
to switch the power supply (lanes).
Thanks for all your kind help, that is much appreciated!
Florian
* Re: libata interface fatal error
2007-06-07 9:50 ` Tejun Heo
2007-06-07 14:08 ` Florian Effenberger
2007-06-13 10:37 ` Florian Effenberger
@ 2007-06-16 10:23 ` Florian Effenberger
2007-06-18 3:13 ` Tejun Heo
2 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-16 10:23 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi there,
we tested out two 600W Fortron PSUs, also tried a BIOS update. Didn't
work out.
We also tried the jumper on the disks labelled SSP (Spread Spectrum
Clocking); that didn't work out either.
What seemed to help at least a little bit is to use the 12V connector on
the board, that is normally dedicated for graphic cards.
The best test to reproduce the problem, according to a colleague also
working on the machine, is a cat /dev/zero > zero.bin
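Note that cat /dev/zero > zero.bin keeps writing until the filesystem is
full. A bounded variant of the same sustained-write load can be sketched in
Python as follows; write_zeros, its parameters and the 1 MiB chunk size are
assumptions for illustration, not something taken from this thread:

```python
import os


def write_zeros(path, total_mib, chunk_mib=1):
    """Write total_mib MiB of zero bytes to path and return the byte count.

    The data is flushed and fsync'ed before returning so that it actually
    reaches the disk instead of staying in the page cache.
    """
    mib = 1024 * 1024
    chunk = bytes(chunk_mib * mib)  # a buffer of zero bytes
    written_mib = 0
    with open(path, "wb") as f:
        while written_mib < total_mib:
            n = min(chunk_mib, total_mib - written_mib)
            f.write(chunk[: n * mib])
            written_mib += n
        f.flush()
        os.fsync(f.fileno())
    return written_mib * mib
```

For example, write_zeros("zero.bin", 4096) would write 4 GiB to a file on
the disk under test and then stop, which makes repeated runs easier to
compare.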
Do you still think it is a PSU or hardware problem? Do you need more
details/logs?
Thanks!
Florian
* Re: libata interface fatal error
2007-06-14 15:12 ` Florian Effenberger
@ 2007-06-18 3:10 ` Tejun Heo
2007-06-18 6:08 ` Tomi Orava
2007-06-18 10:38 ` Florian Effenberger
0 siblings, 2 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 3:10 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Hello,
Florian Effenberger wrote:
> What power supply would you recommend for this type of disk? I think we
> got a 450W Enermax, IIRC.
Most power supplies should be able to do 4 disks without any problem
unless it's broken.
> What do you mean by "merged speed down fix"? Is your fix for the speed
> down logic implemented in the current kernel, so I don't have to patch
> anymore (except when I want to force 1.5Gbps right from the beginning)?
Yeap, kernel will automatically downgrade to 1.5Gbps after several failures.
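Whether a port has been limited can be read off the SStatus and SControl
values that libata prints after a reset, e.g. "SStatus 123 SControl 300".
A small Python sketch, assuming the values are printed in hexadecimal and
follow the standard SATA register layout (DET in bits 3:0, SPD in bits 7:4,
IPM in bits 11:8); split_scr and describe_link are made-up helper names:

```python
# SPD field of SStatus: the speed the link actually negotiated.
NEGOTIATED = {0: "none", 1: "1.5 Gbps", 2: "3.0 Gbps"}
# SPD field of SControl: the highest speed the host allows (0 = no limit).
LIMIT = {0: "none", 1: "1.5 Gbps", 2: "3.0 Gbps"}


def split_scr(value_hex):
    """Split a SATA status/control register value into its DET/SPD/IPM fields."""
    value = int(value_hex, 16)
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


def describe_link(sstatus_hex, scontrol_hex):
    """Return the negotiated speed and the configured speed limit."""
    status = split_scr(sstatus_hex)
    control = split_scr(scontrol_hex)
    return {
        "negotiated": NEGOTIATED.get(status["spd"], "unknown"),
        "limit": LIMIT.get(control["spd"], "unknown"),
    }


# Values as they appear in "SATA link up" lines in this thread.
print(describe_link("123", "300"))
print(describe_link("113", "310"))
```

Under these assumptions, a port that has been slowed down shows a non-zero
speed limit in SControl, as in "SStatus 113 SControl 310".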
--
tejun
* Re: libata interface fatal error
2007-06-16 10:23 ` Florian Effenberger
@ 2007-06-18 3:13 ` Tejun Heo
2007-06-18 10:44 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 3:13 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Hello,
Florian Effenberger wrote:
> we tested out two 600W Fortron PSUs, also tried a BIOS update. Didn't
> work out.
I see.
> We also tried the jumper on the disks labelled SSP (Spread Spectrum
> Clocking); that didn't work out either.
>
> What seemed to help at least a little bit is to use the 12V connector on
> the board, that is normally dedicated for graphic cards.
Hmmmmm....
> The best test to reproduce the problem, according to a colleague also
> working on the machine, is a cat /dev/zero > zero.bin
>
> Do you still think it is a PSU or hardware problem? Do you need more
> details/logs?
The controller being ich8, I'm pretty sure it isn't a driver problem.
Do the errors occur on all four drives? Also, if things work after
speed is downgraded to 1.5Gbps, it doesn't really matter. There's no
noticeable performance difference for single disk anyway.
--
tejun
* Re: libata interface fatal error
2007-06-18 3:10 ` Tejun Heo
@ 2007-06-18 6:08 ` Tomi Orava
2007-06-18 6:28 ` Tejun Heo
2007-06-18 10:38 ` Florian Effenberger
1 sibling, 1 reply; 41+ messages in thread
From: Tomi Orava @ 2007-06-18 6:08 UTC (permalink / raw)
To: Tejun Heo; +Cc: Florian Effenberger, linux-ide, jeff
Hi Tejun,
I've been trying for a long time to find a solution for libata error
messages quite similar to those shown in this thread. Perhaps you might
have some ideas what the actual originator might be:
With the latest 2.6.22-rc4-git4 kernel I still get the following error
messages with high I/O load:
sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata3.00: (port_status 0x20080000)
ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in
res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation)
ata3: soft resetting port
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata3.00: configured for UDMA/133
ata3: EH complete
... and later in the chain ...
sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata3.00: (port_status 0x20080000)
ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in
res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation)
ata3: soft resetting port
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata3.00: configured for UDMA/100
ata3: EH complete
--- This goes on until UDMA/33 has been reached
The problematic hardware combination is:
00:00.0 Host bridge: VIA Technologies, Inc. KT880 Host Bridge (rev 80)
00:00.1 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
00:00.2 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
00:00.3 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
00:00.4 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
00:00.7 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
00:09.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit
Ethernet Controller (rev 13)
00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
00:0e.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
300 TX4) (rev 02)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID
Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge
[KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc.
VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:11.6 Communication controller: VIA Technologies, Inc. AC'97 Modem
Controller (rev 80)
01:00.0 VGA compatible controller: nVidia Corporation NV36.2 [GeForce FX
5700] (rev a1)
and the problems occur only with the Seagate 7200.10 SATA-disks, never with the
older 7200.7 SATA-disks, all connected to the Promise Sata 300TX4-controller.
This problem has been around for as long as I've had the Promise
Sata300TX4 controller. An additional new problem is that, now that the
libata error handling/interface speed downgrade has been fixed (after
kernel version 2.6.21-rc3-git10), these new Seagate disks get downgraded
from UDMA/133 to UDMA/33 overnight (can the speed downgrade be disabled
somehow as a quick and dirty fix in this case?). For some reason the above
mentioned libata error messages don't really do any noticeable harm, but it
would be very nice to be able to prevent the interface speed downgrade for
now.
>> What do you mean by "merged speed down fix"? Is your fix for the speed
>> down logic implemented in the current kernel, so I don't have to patch
>> anymore (except when I want to force 1.5Gbps right from the beginning)?
>
> Yeap, kernel will automatically downgrade to 1.5Gbps after several
> failures.
Yes, this feature seems to work quite nicely as the included logs show.
Regards,
Tomi Orava
PS. These problems are not special to this single machine as a friend at work
has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10
SATA-disks on an Intel-based P4 machine with similar problems under I/O-load.
---------------------------------------------------------
scsi0 : sata_promise
scsi1 : sata_promise
scsi2 : sata_promise
scsi3 : sata_promise
ata1: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x00000000 irq 0
ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x00000000 irq 0
ata3: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x00000000 irq 0
ata4: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x00000000 irq 0
Switched to high resolution mode on CPU 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
ata1.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
ata1.00: 390721968 sectors, multi 0: LBA48
ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
ata2.00: 390721968 sectors, multi 0: LBA48
ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
ata2.00: configured for UDMA/133
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata3.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133
ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata3.00: configured for UDMA/133
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata4.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133
ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
ata4.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
scsi 1:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sdb: sdb1 sdb2
sd 1:0:0:0: [sdb] Attached SCSI disk
sd 1:0:0:0: Attached scsi generic sg1 type 0
scsi 2:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5
sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sdc: sdc1 sdc2
sd 2:0:0:0: [sdc] Attached SCSI disk
sd 2:0:0:0: Attached scsi generic sg2 type 0
scsi 3:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5
sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sdd: sdd1 sdd2
sd 3:0:0:0: [sdd] Attached SCSI disk
sd 3:0:0:0: Attached scsi generic sg3 type 0
--
* Re: libata interface fatal error
2007-06-18 6:08 ` Tomi Orava
@ 2007-06-18 6:28 ` Tejun Heo
0 siblings, 0 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 6:28 UTC (permalink / raw)
To: Tomi Orava; +Cc: Florian Effenberger, linux-ide, jeff, Mikael Pettersson
Hello,
Yeah, it seems Promise has some problem with 3G links. Cc'ing Mikael
Pettersson and quoting whole body for him. Mikael, does this look familiar?
Tomi Orava wrote:
> Hi Tejun,
>
> I've been trying for a long time to find a solution for libata error
> messages quite similar to those shown in this thread. Perhaps you might
> have some ideas what the actual originator might be:
>
> With the latest 2.6.22-rc4-git4 kernel I still get the following error
> messages with high I/O load:
>
> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> sd 2:0:0:0: [sdc] Write Protect is off
> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> ata3.00: (port_status 0x20080000)
> ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in
> res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation)
> ata3: soft resetting port
> ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata3.00: configured for UDMA/133
> ata3: EH complete
>
> ... and later in the chain ...
>
> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> sd 2:0:0:0: [sdc] Write Protect is off
> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> ata3.00: (port_status 0x20080000)
> ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in
> res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation)
> ata3: soft resetting port
> ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata3.00: configured for UDMA/100
> ata3: EH complete
>
> --- This goes on until UDMA/33 has been reached
>
> The problematic hardware combination is:
>
> 00:00.0 Host bridge: VIA Technologies, Inc. KT880 Host Bridge (rev 80)
> 00:00.1 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
> 00:00.2 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
> 00:00.3 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
> 00:00.4 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
> 00:00.7 Host bridge: VIA Technologies, Inc. KT880 Host Bridge
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
> 00:09.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit
> Ethernet Controller (rev 13)
> 00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL-8139/8139C/8139C+ (rev 10)
> 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL-8139/8139C/8139C+ (rev 10)
> 00:0e.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
> 300 TX4) (rev 02)
> 00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID
> Controller (rev 80)
> 00:0f.1 IDE interface: VIA Technologies, Inc.
> VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> Controller (rev 81)
> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> Controller (rev 81)
> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> Controller (rev 81)
> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
> Controller (rev 81)
> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge
> [KT600/K8T800/K8T890 South]
> 00:11.5 Multimedia audio controller: VIA Technologies, Inc.
> VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
> 00:11.6 Communication controller: VIA Technologies, Inc. AC'97 Modem
> Controller (rev 80)
> 01:00.0 VGA compatible controller: nVidia Corporation NV36.2 [GeForce FX
> 5700] (rev a1)
>
> and the problems occur only with the Seagate 7200.10 SATA-disks, never with the
> older 7200.7 SATA-disks, all connected to the Promise Sata 300TX4-controller.
>
> This problem has been around for as long as I've had the Promise
> Sata300TX4 controller. An additional new problem is that, now that the
> libata error handling/interface speed downgrade has been fixed (after
> kernel version 2.6.21-rc3-git10), these new Seagate disks get downgraded
> from UDMA/133 to UDMA/33 overnight (can the speed downgrade be disabled
> somehow as a quick and dirty fix in this case?). For some reason the above
> mentioned libata error messages don't really do any noticeable harm, but it
> would be very nice to be able to prevent the interface speed downgrade for
> now.
>
>>> What do you mean by "merged speed down fix"? Is your fix for the speed
>>> down logic implemented in the current kernel, so I don't have to patch
>>> anymore (except when I want to force 1.5Gbps right from the beginning)?
>> Yeap, kernel will automatically downgrade to 1.5Gbps after several
>> failures.
>
> Yes, this feature seems to work quite nicely as the included logs show.
>
> Regards,
> Tomi Orava
>
> PS. These problems are not special to this single machine as a friend at work
> has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10
> SATA-disks on an Intel-based P4 machine with similar problems under I/O-load.
>
> ---------------------------------------------------------
> scsi0 : sata_promise
> scsi1 : sata_promise
> scsi2 : sata_promise
> scsi3 : sata_promise
> ata1: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x00000000 irq 0
> ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x00000000 irq 0
> ata3: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x00000000 irq 0
> ata4: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x00000000 irq 0
> Switched to high resolution mode on CPU 0
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
> ata1.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
> ata1.00: 390721968 sectors, multi 0: LBA48
> ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
> ata1.00: configured for UDMA/133
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
> ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
> ata2.00: 390721968 sectors, multi 0: LBA48
> ata2.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968
> ata2.00: configured for UDMA/133
> ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata3.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133
> ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata3.00: configured for UDMA/133
> ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata4.00: ATA-7: ST3500630AS, 3.AAK, max UDMA/133
> ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
> ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> ata4.00: configured for UDMA/133
> scsi 0:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5
> sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sda: sda1 sda2
> sd 0:0:0:0: [sda] Attached SCSI disk
> sd 0:0:0:0: Attached scsi generic sg0 type 0
> scsi 1:0:0:0: Direct-Access ATA ST3200822AS 3.01 PQ: 0 ANSI: 5
> sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB)
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sd 1:0:0:0: [sdb] 390721968 512-byte hardware sectors (200050 MB)
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sdb: sdb1 sdb2
> sd 1:0:0:0: [sdb] Attached SCSI disk
> sd 1:0:0:0: Attached scsi generic sg1 type 0
> scsi 2:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5
> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> sd 2:0:0:0: [sdc] Write Protect is off
> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> sd 2:0:0:0: [sdc] Write Protect is off
> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sdc: sdc1 sdc2
> sd 2:0:0:0: [sdc] Attached SCSI disk
> sd 2:0:0:0: Attached scsi generic sg2 type 0
> scsi 3:0:0:0: Direct-Access ATA ST3500630AS 3.AA PQ: 0 ANSI: 5
> sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> sd 3:0:0:0: [sdd] Write Protect is off
> sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> sd 3:0:0:0: [sdd] Write Protect is off
> sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sdd: sdd1 sdd2
> sd 3:0:0:0: [sdd] Attached SCSI disk
> sd 3:0:0:0: Attached scsi generic sg3 type 0
>
--
tejun
* Re: libata interface fatal error
@ 2007-06-18 7:05 Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
` (2 more replies)
0 siblings, 3 replies; 41+ messages in thread
From: Mikael Pettersson @ 2007-06-18 7:05 UTC (permalink / raw)
To: Tomi.Orava, htejun; +Cc: florian, jeff, linux-ide, mikpe
On Mon, 18 Jun 2007 15:28:44 +0900, Tejun Heo wrote:
> Yeah, it seems Promise has some problem with 3G links. Cc'ing Mikael
> Pettersson and quoting whole body for him. Mikael, does this look familiar?
>
> Tomi Orava wrote:
> > Hi Tejun,
> >
> > I've been trying for a long time to find a solution for libata error
> > messages quite similar to those shown in this thread. Perhaps you might
> > have some ideas what the actual originator might be:
> >
> > With the latest 2.6.22-rc4-git4 kernel I still get the following error
> > messages with high I/O load:
> >
> > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> > sd 2:0:0:0: [sdc] Write Protect is off
> > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> > support DPO or FUA
> > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> > ata3.00: (port_status 0x20080000)
> > ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in
> > res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation)
> > ata3: soft resetting port
> > ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: configured for UDMA/133
> > ata3: EH complete
> >
> > ... and later in the chain ...
> >
> > sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> > sd 2:0:0:0: [sdc] Write Protect is off
> > sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> > sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> > support DPO or FUA
> > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> > ata3.00: (port_status 0x20080000)
> > ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in
> > res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation)
> > ata3: soft resetting port
> > ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
> > ata3.00: configured for UDMA/100
> > ata3: EH complete
> >
> > --- This goes on until UDMA/33 has been reached
...
> > and the problems relate only to Seagate 7200.10 SATA-disks, never with the
> > older 7200.7 SATA-disks, all connected to Promise Sata 300TX4-controller.
...
> > PS. These problems are not special to this single machine as a friend at work
> > has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10
> > SATA-disks on an intel-based P4 machine with similar problems under I/O-load.
Yes, this is familiar. Several people have reported problems with
Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
Unfortunately the error reports don't really give a clue as to what
the root cause is.
I used to be able to forcibly trigger similar errors with their
7200.9 disks, but I can't seem to do that any more.
/Mikael
^ permalink raw reply [flat|nested] 41+ messages in thread
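A side note for anyone reading these reports: the Emask field is a bitmask of libata's completion-error flags. The sketch below decodes it. The two values seen in this thread (0x2 for "HSM violation", 0x10 for "ATA bus error") come straight from the logs; the remaining bit names are written from memory of enum ata_completion_errors in include/linux/libata.h and should be treated as assumptions to check against your kernel tree. decode_emask is a hypothetical helper name.

```python
# Assumed bit layout of libata's Emask (enum ata_completion_errors).
# Only bits 1 and 4 are confirmed by the log lines in this thread.
AC_ERR_BITS = {
    0: "device error",
    1: "HSM violation",
    2: "timeout",
    3: "media error",
    4: "ATA bus error",
    5: "host bus error",
    6: "system error",
    7: "invalid argument",
    8: "unknown error",
    9: "NCQ marker",
}


def decode_emask(emask):
    """Return the names of all error bits set in an Emask value."""
    return [name for bit, name in sorted(AC_ERR_BITS.items()) if emask & (1 << bit)]


for value in (0x0, 0x2, 0x10):
    print(hex(value), decode_emask(value))
```

An Emask of 0x0 on the "exception" line with 0x2 on the "res" line, as in the reports above, means the error was attributed to the individual command result rather than to the port as a whole.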
* Re: libata interface fatal error
2007-06-18 7:05 libata interface fatal error Mikael Pettersson
@ 2007-06-18 7:13 ` Tejun Heo
2007-06-18 10:47 ` Florian Effenberger
2007-06-18 17:14 ` Ansgar Knappheide
2007-06-18 18:54 ` Tomi Orava
2 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 7:13 UTC (permalink / raw)
To: Mikael Pettersson; +Cc: Tomi.Orava, florian, jeff, linux-ide
Mikael Pettersson wrote:
> Yes, this is familiar. Several people have reported problems with
> Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
> Unfortunately the error reports don't really give a clue as to what
> the root cause is.
>
> I used to be able to forcibly trigger similar errors with their
> 7200.9 disks, but I can't seem to do that any more.
Maybe we need to limit link speed to 1.5Gbps for these drives on
sata_promise?
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
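The link-speed change is already visible in the logs quoted above: the port first comes up with "SStatus 123 SControl 300" and later with "SStatus 113 SControl 310". A minimal sketch of how those hex values split into fields, assuming the standard SATA SStatus/SControl layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11). The helper names are hypothetical and not part of any kernel interface.

```python
# Sketch only: field layout assumed from the SATA SStatus/SControl register
# definitions (DET = bits 0-3, SPD = bits 4-7, IPM = bits 8-11).
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps"}


def decode_scr(value):
    """Split an SStatus/SControl value into its three 4-bit fields."""
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def spd_name(spd):
    """Map an SPD field to a speed string; 0 means 'none' in both registers."""
    if spd == 0:
        return "none"
    return SPEEDS.get(spd, "unknown")


def link_speed(sstatus):
    """Negotiated speed reported by SStatus."""
    return spd_name(decode_scr(sstatus)["spd"])


def speed_limit(scontrol):
    """Highest speed allowed by SControl ('none' means no restriction)."""
    return spd_name(decode_scr(scontrol)["spd"])


# The kernel prints both registers in hex, so "SStatus 123" is 0x123.
for sstatus, scontrol in [(0x123, 0x300), (0x113, 0x310)]:
    print(f"link: {link_speed(sstatus)}, limit: {speed_limit(scontrol)}")
```

Read this way, the change from SControl 300 to 310 is the speed limit being set to 1.5 Gbps, and SStatus 113 shows the link renegotiating at that speed.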
* Re: libata interface fatal error
2007-06-18 3:10 ` Tejun Heo
2007-06-18 6:08 ` Tomi Orava
@ 2007-06-18 10:38 ` Florian Effenberger
2007-06-18 10:44 ` Tejun Heo
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 10:38 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> Yeap, kernel will automatically downgrade to 1.5Gbps after several failures.
is there also a boot-time option to force 1.5Gbps right from booting up?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:38 ` Florian Effenberger
@ 2007-06-18 10:44 ` Tejun Heo
0 siblings, 0 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 10:44 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi,
>
>> Yeap, kernel will automatically downgrade to 1.5Gbps after several
>> failures.
>
> is there also a boot-time option to force 1.5Gbps right from booting up?
Nope.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 3:13 ` Tejun Heo
@ 2007-06-18 10:44 ` Florian Effenberger
2007-06-18 10:56 ` Tejun Heo
0 siblings, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 10:44 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> The controller being ich8, I'm pretty sure it isn't a driver problem.
I think so, too. The Intel chipsets have proven to be very good in the past.
> Do the errors occur on all four drives? Also, if things work after
> speed is downgraded to 1.5Gbps, it doesn't really matter. There's no
> noticeable performance difference for single disk anyway.
Yes, they do occur on all drives, as far as I know. With 1.5Gbps, the
error doesn't occur nearly as often, and not under normal circumstances,
only when doing a really hard stress test.
Would it make sense to downgrade to 1.5 Gbps via a boot option?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 7:13 ` Tejun Heo
@ 2007-06-18 10:47 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 10:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: Mikael Pettersson, Tomi.Orava, jeff, linux-ide
Hi,
> Maybe we need to limit link speed to 1.5Gbps for these drives on
> sata_promise?
in our case, it's a
Vendor: ATA Model: WDC WD1600YS-01S Rev: 20.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD3200YS-01P Rev: 21.0
Type: Direct-Access ANSI SCSI revision: 05
on a
00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Unknown device 016a (rev a1)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
04:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
04:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
05:01.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 0c)
05:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
Maybe blacklisting makes sense here, too?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:44 ` Florian Effenberger
@ 2007-06-18 10:56 ` Tejun Heo
2007-06-18 11:28 ` Florian Effenberger
2007-06-24 11:32 ` Florian Effenberger
0 siblings, 2 replies; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 10:56 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> Hi,
>
>> The controller being ich8, I'm pretty sure it isn't a driver problem.
>
> I think so, too. The Intel chipsets have proven to be very good in the past.
>
>> Do the errors occur on all four drives? Also, if things work after
>> speed is downgraded to 1.5Gbps, it doesn't really matter. There's no
>> noticeable performance difference for single disk anyway.
>
> Yes, they do occur on all drives, as far as I know. With 1.5Gbps, the
> error doesn't occur nearly as often, and not under normal circumstances,
> only when doing a really hard stress test.
Hmmm... Can you use a separate PSU to power two of the four drives and
see what happens? Just power up a PSU as directed in the following
webpage and connect two of the harddrives to the PSU.
http://modtown.co.uk/mt/article2.php?id=psumod
> Would it make sense to downgrade to 1.5 Gbps via a boot option?
I don't know. Till now all the problem cases have been isolated to a
specific controller / drive combination (sata_promise and newer seagate
drives) or a hardware configuration problem (most of them being PSU
issues), so I don't think we need such an option yet. If you have
problematic hardware which pukes on 3.0Gbps, libata should do the right
thing after complaining a bit, which IMHO isn't too bad.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:56 ` Tejun Heo
@ 2007-06-18 11:28 ` Florian Effenberger
2007-06-18 11:30 ` Tejun Heo
2007-06-24 11:32 ` Florian Effenberger
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 11:28 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Hmmm... Can you use a separate PSU to power two of the four drives and
> see what happens? Just power up a PSU as directed in the following
> webpage and connect two of the harddrives to the PSU.
>
> http://modtown.co.uk/mt/article2.php?id=psumod
thanks for that link, we will try that and keep you updated on what happens!
> I don't know. Till now all the problem cases have been isolated to a
> specific controller / drive combination (sata_promise and newer seagate
> drives) or a hardware configuration problem (most of them being PSU
> issues), so I don't think we need such an option yet. If you have
> problematic hardware which pukes on 3.0Gbps, libata should do the right
> thing after complaining a bit, which IMHO isn't too bad.
So, loss of data or data corruption can't occur, even when we have to
wait until the speed is limited?
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 11:28 ` Florian Effenberger
@ 2007-06-18 11:30 ` Tejun Heo
2007-06-18 11:32 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-18 11:30 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
>> I don't know. Till now all the problem cases have been isolated to a
>> specific controller / drive combination (sata_promise and newer seagate
>> drives) or a hardware configuration problem (most of them being PSU
>> issues), so I don't think we need such an option yet. If you have
>> problematic hardware which pukes on 3.0Gbps, libata should do the right
>> thing after complaining a bit, which IMHO isn't too bad.
>
> So, loss of data or data corruption can't occur, even when we have to
> wait until the speed is limited?
Nope, there's nothing to worry about.
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 11:30 ` Tejun Heo
@ 2007-06-18 11:32 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-18 11:32 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi,
> Nope, there's nothing to worry about.
okay, thanks a lot so far, it is good to know that developers are there
to help. ;-)
I will let you know how it turned out with the second PSU.
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 7:05 libata interface fatal error Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
@ 2007-06-18 17:14 ` Ansgar Knappheide
2007-06-18 18:54 ` Tomi Orava
2 siblings, 0 replies; 41+ messages in thread
From: Ansgar Knappheide @ 2007-06-18 17:14 UTC (permalink / raw)
To: linux-ide
Mikael Pettersson schrieb:
> On Mon, 18 Jun 2007 15:28:44 +0900, Tejun Heo wrote:
>
>> Yeah, it seems promise has some problem with 3G link. Cc'ing Mikael
>> Pettersson and quoting whole body for him. Mikael, does this look familiar?
>>
>> Tomi Orava wrote:
>>
>>> Hi Tejun,
>>>
>>> I've been trying for a long time to find a solution for libata error
>>> messages quite similar to those shown in this thread. Perhaps you might
>>> have some idea what the actual cause might be:
>>>
>>> With the latest 2.6.22-rc4-git4 kernel I still get the following error
>>> messages with high I/O load:
>>>
>>> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
>>> sd 2:0:0:0: [sdc] Write Protect is off
>>> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
>>> support DPO or FUA
>>> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
>>> ata3.00: (port_status 0x20080000)
>>> ata3.00: cmd c8/00:08:af:91:49/00:00:00:00:00/e5 tag 0 cdb 0x0 data 4096 in
>>> res 50/00:00:b6:91:49/00:00:11:00:00/e5 Emask 0x2 (HSM violation)
>>> ata3: soft resetting port
>>> ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: configured for UDMA/133
>>> ata3: EH complete
>>>
>>> ... and later in the chain ...
>>>
>>> sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
>>> sd 2:0:0:0: [sdc] Write Protect is off
>>> sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
>>> support DPO or FUA
>>> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
>>> ata3.00: (port_status 0x20080000)
>>> ata3.00: cmd c8/00:08:67:74:65/00:00:00:00:00/ec tag 0 cdb 0x0 data 4096 in
>>> res 50/00:00:6e:74:65/00:00:1b:00:00/ec Emask 0x2 (HSM violation)
>>> ata3: soft resetting port
>>> ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> ata3.00: configured for UDMA/100
>>> ata3: EH complete
>>>
>>> --- This goes on until UDMA/33 has been reached
>>>
> ...
>
>>> and the problems relate only to Seagate 7200.10 SATA-disks, never with the
>>> older 7200.7 SATA-disks, all connected to Promise Sata 300TX4-controller.
>>>
> ...
>
>>> PS. These problems are not special to this single machine as a friend at work
>>> has the same Promise Sata300TX4 card with exactly the same Seagate 7200.10
>>> SATA-disks on an intel-based P4 machine with similar problems under I/O-load.
>>>
>
> Yes, this is familiar. Several people have reported problems with
> Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
> Unfortunately the error reports don't really give a clue as to what
> the root cause is.
>
> I used to be able to forcibly trigger similar errors with their
> 7200.9 disks, but I can't seem to do that any more.
>
>
Hello,
I'm jumping into this thread because I'm seeing the same problem on my
system with a Promise SATAII 150 TX4 (PDC40518) and a Maxtor 6L200M0
(BANC1E00) hard drive, with the following error:
Jun 18 01:16:03 buffy kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Jun 18 01:16:03 buffy kernel: ata1.00: (port_status 0x20080000)
Jun 18 01:16:03 buffy kernel: ata1.00: cmd c8/00:15:e1:e3:16/00:00:00:00:00/e6 tag 0 cdb 0x0 data 10752 in
Jun 18 01:16:03 buffy kernel: res 50/00:00:f5:e3:16/00:00:00:00:00/e6 Emask 0x2 (HSM violation)
Jun 18 01:16:03 buffy kernel: ata1: soft resetting port
Jun 18 01:16:03 buffy kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jun 18 01:16:03 buffy kernel: ata1.00: ata_hpa_resize 1: sectors = 398297088, hpa_sectors = 398297088
Jun 18 01:16:03 buffy kernel: ata1.00: ata_hpa_resize 1: sectors = 398297088, hpa_sectors = 398297088
Jun 18 01:16:03 buffy kernel: ata1.00: configured for UDMA/133
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
Jun 18 01:16:03 buffy kernel: Descriptor sense data with sense descriptors (in hex):
Jun 18 01:16:03 buffy kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Jun 18 01:16:03 buffy kernel: 06 16 e3 f5
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
Jun 18 01:16:03 buffy kernel: end_request: I/O error, dev sda, sector 102163425
Jun 18 01:16:03 buffy kernel: ata1: EH complete
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Write Protect is off
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Jun 18 01:16:03 buffy kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
In normal use this error shows up only once a week, but when transferring
a lot of data (> 100MB) to a USB stick the error shows up every few seconds,
with only the data values differing. When transferring data from the
USB stick to the hard drive, no error shows up.
Other information on my system:
smartctl -d sat -a /dev/sda
smartctl version 5.38 [i686-suse-linux] Copyright (C) 2002-7 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Model Family: Maxtor DiamondMax 10 family (ATA/133 and SATA/150)
Device Model: Maxtor 6L200M0
Serial Number: L40A4PDH
Firmware Version: BANC1E00
User Capacity: 203.928.109.056 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Mon Jun 18 19:11:50 2007 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Warning! SMART Attribute Thresholds Structure error: invalid SMART checksum.
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x02) Offline data collection activity
was completed without error.
Auto Offline Data Collection:
Disabled.
Self-test execution status: ( 0) The previous self-test routine
completed
without error or no self-test
has ever
been run.
Total time to complete Offline
data collection: (1562) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection
on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 81) minutes.
SCT capabilities: (0x0021) SCT Status supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027 206   204   063    Pre-fail Always  -           10179
  4 Start_Stop_Count        0x0032 253   253   000    Old_age  Always  -           1502
  5 Reallocated_Sector_Ct   0x0033 253   253   063    Pre-fail Always  -           0
  6 Read_Channel_Margin     0x0001 253   253   100    Pre-fail Offline -           0
  7 Seek_Error_Rate         0x000a 253   252   000    Old_age  Always  -           0
  8 Seek_Time_Performance   0x0027 246   240   187    Pre-fail Always  -           37304
  9 Power_On_Minutes        0x0032 239   239   000    Old_age  Always  -           539h+13m
 10 Spin_Retry_Count        0x002b 253   252   157    Pre-fail Always  -           0
 11 Calibration_Retry_Count 0x002b 253   252   223    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 250   250   000    Old_age  Always  -           1570
192 Power-Off_Retract_Count 0x0032 253   253   000    Old_age  Always  -           0
193 Load_Cycle_Count        0x0032 253   253   000    Old_age  Always  -           0
194 Temperature_Celsius     0x0032 031   253   000    Old_age  Always  -           33
195 Hardware_ECC_Recovered  0x000a 253   252   000    Old_age  Always  -           9263
196 Reallocated_Event_Count 0x0008 253   253   000    Old_age  Offline -           0
197 Current_Pending_Sector  0x0008 253   253   000    Old_age  Offline -           0
198 Offline_Uncorrectable   0x0008 253   253   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0008 199   199   000    Old_age  Offline -           0
200 Multi_Zone_Error_Rate   0x000a 253   252   000    Old_age  Always  -           0
201 Soft_Read_Error_Rate    0x000a 253   252   000    Old_age  Always  -           0
202 TA_Increase_Count       0x000a 253   252   000    Old_age  Always  -           0
203 Run_Out_Cancel          0x000b 253   252   180    Pre-fail Always  -           0
204 Shock_Count_Write_Opern 0x000a 253   252   000    Old_age  Always  -           0
205 Shock_Rate_Write_Opern  0x000a 253   252   000    Old_age  Always  -           0
207 Spin_High_Current       0x002a 253   252   000    Old_age  Always  -           0
208 Spin_Buzz               0x002a 253   252   000    Old_age  Always  -           0
209 Offline_Seek_Performnce 0x0024 239   239   000    Old_age  Offline -           179
210 Unknown_Attribute       0x0032 253   252   000    Old_age  Always  -           0
211 Unknown_Attribute       0x0032 253   252   000    Old_age  Always  -           0
212 Unknown_Attribute       0x0032 253   252   000    Old_age  Always  -           0
Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline    Completed without error 00%       1163            -
# 2 Short offline    Completed without error 00%       1163            -
# 3 Offline          Aborted by host         70%       0               -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
hdparm -I /dev/sda
/dev/sda:
ATA device, with non-removable media
Model Number: Maxtor 6L200M0
Serial Number: L40A4PDH
Firmware Revision: BANC1E00
Standards:
Used: ATA/ATAPI-7 T13 1532D revision 0
Supported: 7 6 5 4
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 398297088
device size with M = 1024*1024: 194481 MBytes
device size with M = 1000*1000: 203928 MBytes (203 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Advanced power management level: unknown setting (0x0000)
Recommended acoustic management value: 192, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_VERIFY command
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Advanced Power Management feature set
SET_MAX security extension
* Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* SATA-I signaling speed (1.5Gb/s)
* Native Command Queueing (NCQ)
Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Checksum: correct
lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
00:06.0 Ethernet controller: D-Link System Inc RTL8139 Ethernet (rev 10)
00:07.0 Mass storage controller: Promise Technology, Inc. PDC20518/PDC40518 (SATAII 150 TX4) (rev 02)
00:0b.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 08)
00:0b.1 Input device controller: Creative Labs SB Live! Game Port (rev 08)
00:0c.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
01:00.0 VGA compatible controller: nVidia Corporation NV25 [GeForce4 Ti 4200] (rev a3)
Perhaps this will help to resolve the problem
Ansgar
^ permalink raw reply [flat|nested] 41+ messages in thread
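For reference, the cmd/res lines in Ansgar's log can be unpacked by hand. A rough sketch, assuming libata's report layout of command/feature:nsect:lbal:lbam:lbah, then the five matching "hob" bytes, then the device register (this is my reading of the EH output, not something stated in this thread) and a 28-bit LBA command such as 0xc8 (READ DMA). The function names are hypothetical. On the res line the first two bytes are the status and error registers rather than command and feature.

```python
FIELDS = ["feature", "nsect", "lbal", "lbam", "lbah"]


def parse_tf(text):
    """Parse 'c8/00:15:e1:e3:16/00:00:00:00:00/e6' into named byte fields."""
    first, low, high, device = text.split("/")
    # On a 'res' line, 'command' holds status and 'feature' holds error.
    tf = {"command": int(first, 16), "device": int(device, 16)}
    tf.update(zip(FIELDS, (int(b, 16) for b in low.split(":"))))
    tf.update(zip(("hob_" + n for n in FIELDS), (int(b, 16) for b in high.split(":"))))
    return tf


def lba28(tf):
    """28-bit LBA: low nibble of the device register plus lbah/lbam/lbal."""
    return ((tf["device"] & 0x0F) << 24) | (tf["lbah"] << 16) | (tf["lbam"] << 8) | tf["lbal"]


def transfer_bytes(tf, sector_size=512):
    """Transfer length; a sector count of 0 means 256 for 28-bit commands."""
    return (tf["nsect"] or 256) * sector_size


cmd = parse_tf("c8/00:15:e1:e3:16/00:00:00:00:00/e6")
res = parse_tf("50/00:00:f5:e3:16/00:00:00:00:00/e6")
print("command", hex(cmd["command"]), "start LBA", lba28(cmd), "bytes", transfer_bytes(cmd))
print("result LBA", lba28(res))
```

For the log above this gives a start LBA of 102163425 and 10752 bytes, matching the end_request sector and the "data 10752 in" field, which suggests the assumed layout is right at least for this command.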
* Re: libata interface fatal error
2007-06-18 7:05 libata interface fatal error Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
2007-06-18 17:14 ` Ansgar Knappheide
@ 2007-06-18 18:54 ` Tomi Orava
2 siblings, 0 replies; 41+ messages in thread
From: Tomi Orava @ 2007-06-18 18:54 UTC (permalink / raw)
Cc: htejun, florian, jeff, linux-ide, mikpe
> On Mon, 18 Jun 2007 15:28:44 +0900, Tejun Heo wrote:
>> Yeah, it seems promise has some problem with 3G link. Cc'ing Mikael
>> Pettersson and quoting whole body for him. Mikael, does this look
>> familiar?
>>
>> Tomi Orava wrote:
>> > Hi Tejun,
>> >
>> > I've been trying for a long time to find a solution for libata error
>> > messages quite similar to those shown in this thread. Perhaps you might
>> > have some idea what the actual cause might be:
>> >
>> > With the latest 2.6.22-rc4-git4 kernel I still get the following error
>> > messages with high I/O load:
<snip>
>> > and the problems relate only to Seagate 7200.10 SATA-disks, never with
>> > the older 7200.7 SATA-disks, all connected to Promise Sata
>> > 300TX4-controller.
> ...
>> > PS. These problems are not special to this single machine as a friend
>> > at work has the same Promise Sata300TX4 card with exactly the same
>> > Seagate 7200.10 SATA-disks on an intel-based P4 machine with similar
>> > problems under I/O-load.
>
> Yes, this is familiar. Several people have reported problems with
> Seagate's 7200.10 disks in 3Gbps operation on sata_promise.
> Unfortunately the error reports don't really give a clue as to what
> the root cause is.
>
> I used to be able to forcibly trigger similar errors with their
> 7200.9 disks, but I can't seem to do that any more.
Hmm, are you really sure that this is 3Gbps mode related?
I'm wondering about that, as the problem is there regardless of whether the
1.5Gbps jumper is set on the 7200.10 disks or not. Also, I retested
your older sata_promise 1.5Gbps speed limit patch and it did not
fix the problem. This is really strange!
I've now connected the two problematic 7200.10 disks to a Via VT6420
controller and the problem has gone away for me (for now). It would be great
to figure out what the actual problem is here, though ...
Regards,
Tomi Orava
--
Tomi.Orava@ncircle.nullnet.fi
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-18 10:56 ` Tejun Heo
2007-06-18 11:28 ` Florian Effenberger
@ 2007-06-24 11:32 ` Florian Effenberger
2007-06-25 2:49 ` Tejun Heo
1 sibling, 1 reply; 41+ messages in thread
From: Florian Effenberger @ 2007-06-24 11:32 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi there,
sorry, it seems it was all a false alarm and our mainboard was
defective. In the end, it turned on only sometimes. To test it, we
wanted to install Windows, which didn't work either.
Now the dealer has replaced the motherboard, and we are just fine with 3.0
Gbps and Kernel 2.6.21.5.
Sorry for the big confusion, and thanks for your great help! I didn't know the
board was defective in the first place; there had been no indications
of that...
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-24 11:32 ` Florian Effenberger
@ 2007-06-25 2:49 ` Tejun Heo
2007-06-25 8:47 ` Florian Effenberger
0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2007-06-25 2:49 UTC (permalink / raw)
To: Florian Effenberger; +Cc: linux-ide, jeff
Florian Effenberger wrote:
> sorry, it seems it was all a false alarm and our mainboard was
> defective. In the end, it turned on only sometimes. To test it, we
> wanted to install Windows, which didn't work either.
>
> Now the dealer has replaced the motherboard, and we are just fine with 3.0
> Gbps and Kernel 2.6.21.5.
>
> Sorry for the big confusion, and thanks for your great help! I didn't know the
> board was defective in the first place; there had been no indications
> of that...
Yeah, things like these are tricky. SATA is usually the first one to
suffer from hardware defects, including power fluctuation due to input
power, PSU or on-board voltage regulator problems, because the link is
relatively long and runs at very high speed. I also heard that SATA
cables should have been made more resistant to interference but I'm no
expert in that area.
It's interesting to see how it got solved. Thanks for another data
point to blame hardware when I don't have a clue. :-)
--
tejun
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: libata interface fatal error
2007-06-25 2:49 ` Tejun Heo
@ 2007-06-25 8:47 ` Florian Effenberger
0 siblings, 0 replies; 41+ messages in thread
From: Florian Effenberger @ 2007-06-25 8:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, jeff
Hi Tejun,
> Yeah, things like these are tricky. SATA is usually the first one to
> suffer from hardware defects, including power fluctuation due to input
> power, PSU or on-board voltage regulator problems, because the link is
> relatively long and runs at very high speed. I also heard that SATA
> cables should have been made more resistant to interference but I'm no
> expert in that area.
me neither. I first thought of a driver issue, because the machine ran
fine at first and only started to show mysterious effects some weeks later...
> It's interesting to see how it got solved. Thanks for another data
> point to blame hardware when I don't have a clue. :-)
Hehe, you're welcome. ;-)
Thanks for all your efforts, I really appreciate them!
Florian
^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2007-06-25 8:47 UTC | newest]
Thread overview: 41+ messages
2007-06-18 7:05 libata interface fatal error Mikael Pettersson
2007-06-18 7:13 ` Tejun Heo
2007-06-18 10:47 ` Florian Effenberger
2007-06-18 17:14 ` Ansgar Knappheide
2007-06-18 18:54 ` Tomi Orava
-- strict thread matches above, loose matches on Subject: below --
2007-05-26 9:43 Florian Effenberger
2007-05-29 9:16 ` Tejun Heo
2007-05-29 14:16 ` Florian Effenberger
2007-06-06 21:23 ` Florian Effenberger
2007-06-07 9:50 ` Tejun Heo
2007-06-07 14:08 ` Florian Effenberger
2007-06-13 10:37 ` Florian Effenberger
2007-06-14 9:43 ` Tejun Heo
2007-06-14 11:12 ` Florian Effenberger
2007-06-14 12:25 ` Tejun Heo
2007-06-14 15:12 ` Florian Effenberger
2007-06-18 3:10 ` Tejun Heo
2007-06-18 6:08 ` Tomi Orava
2007-06-18 6:28 ` Tejun Heo
2007-06-18 10:38 ` Florian Effenberger
2007-06-18 10:44 ` Tejun Heo
2007-06-16 10:23 ` Florian Effenberger
2007-06-18 3:13 ` Tejun Heo
2007-06-18 10:44 ` Florian Effenberger
2007-06-18 10:56 ` Tejun Heo
2007-06-18 11:28 ` Florian Effenberger
2007-06-18 11:30 ` Tejun Heo
2007-06-18 11:32 ` Florian Effenberger
2007-06-24 11:32 ` Florian Effenberger
2007-06-25 2:49 ` Tejun Heo
2007-06-25 8:47 ` Florian Effenberger
2007-05-24 13:25 Florian Effenberger
2007-05-24 13:45 ` Tejun Heo
2007-05-24 14:08 ` Florian Effenberger
2007-05-24 14:21 ` Tejun Heo
2007-05-24 14:47 ` Florian Effenberger
2007-05-24 14:53 ` Tejun Heo
2007-05-24 15:28 ` Florian Effenberger
2007-05-24 14:55 ` Greg Freemyer
2007-05-24 14:59 ` Tejun Heo
2007-05-24 15:00 ` Florian Effenberger