* ATA resets with Intel 8/C220 and HGST drive
@ 2015-04-27 13:03 Nicolas George
2015-04-27 19:41 ` Dan Ritter
2015-04-29 19:59 ` Selim T. Erdoğan
0 siblings, 2 replies; 3+ messages in thread
From: Nicolas George @ 2015-04-27 13:03 UTC
To: debian-user, linux-kernel
Summary: I had annoying resets of the SATA bus with an 8 Series/C220 Series
Chipset controller and an HGST Travelstar 7K1000 drive. I recently managed to
stop them, and as far as I can tell the problem is solved; I am writing this
mail in the hope that it will be useful to anyone with similar issues. If you
do not have this issue and are not a developer interested in fixing it more
permanently, you can stop reading right now.
Here are the details. The computer is a Zotac ZBox ID91 nettop with a
proprietary motherboard and, as stated above, a Travelstar 7K1000 hard
drive (a 7200 RPM 2.5" drive, an unusual beast). It was installed around
June 2014, and I noticed the problems some time later; they probably started
right away.
The distribution was Debian Jessie (testing) with the packaged kernel,
probably linux-image-3.14-1-amd64:amd64 at the time; upgrades did not fix
the issue.
The possibly relevant hardware information is as follows:
CPU: Intel(R) Core(TM) i3-4130T CPU @ 2.90GHz

description: SATA controller
product: 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]
vendor: Intel Corporation
physical id: 1f.2
bus info: pci@0000:00:1f.2
version: 05
width: 32 bits
clock: 66MHz
capabilities: storage msi pm ahci_1.0 bus_master cap_list
configuration: driver=ahci latency=0
resources: irq:42 ioport:f0b0(size=8) ioport:f0a0(size=4) ioport:f090(size=8) ioport:f080(size=4) ioport:f060(size=32) memory:f7d1a000-f7d1a7ff

description: ATA Disk
product: HGST HTS721010A9
physical id: 0.0.0
bus info: scsi@1:0.0.0
logical name: /dev/sda
version: A3J0
serial: [REMOVED]
size: 931GiB (1TB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096 signature=d3079a6d

The resets happened a few times a day (this computer is kept on for more
than a day at a time and suspend is not used), mostly when the disk was in
heavy use, sometimes as early as during boot; there were a few good days when
they did not happen. They were annoying because they caused a freeze of a few
seconds for anything reading from disk; AFAIK they never resulted in data
corruption.
The corresponding kernel messages look like this:
[ 337.466498] ata2: EH complete
[ 367.251032] ata2.00: exception Emask 0x10 SAct 0x80000 SErr 0x400100 action 0x6 frozen
[ 367.251041] ata2.00: irq_stat 0x08000000, interface fatal error
[ 367.251046] ata2: SError: { UnrecovData Handshk }
[ 367.251053] ata2.00: failed command: WRITE FPDMA QUEUED
[ 367.251063] ata2.00: cmd 61/08:98:68:3b:40/00:00:6b:00:00/40 tag 19 ncq 4096 out
[ 367.251063] res 50/00:08:68:3b:40/00:00:6b:00:00/40 Emask 0x10 (ATA bus error)
[ 367.251068] ata2.00: status: { DRDY }
[ 367.251075] ata2: hard resetting link
[ 367.571128] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 367.577660] ata2.00: configured for UDMA/133
[ 367.577676] ata2: EH complete
[ 409.772730] ata2: limiting SATA link speed to 3.0 Gbps
[ 409.772735] ata2.00: exception Emask 0x10 SAct 0x3fe00 SErr 0x400100 action 0x6 frozen
[ 409.772736] ata2.00: irq_stat 0x08000000, interface fatal error
[ 409.772737] ata2: SError: { UnrecovData Handshk }
[ 409.772739] ata2.00: failed command: READ FPDMA QUEUED
[ 409.772742] ata2.00: cmd 60/08:48:78:09:41/00:00:01:00:00/40 tag 9 ncq 4096 in
[ 409.772742] res 50/00:28:e0:a3:04/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 409.772743] ata2.00: status: { DRDY }
<snip seven similar "failed command...DRDY" blocks>
[ 409.772773] ata2.00: failed command: WRITE FPDMA QUEUED
[ 409.772776] ata2.00: cmd 61/28:88:e0:a3:04/00:00:02:00:00/40 tag 17 ncq 20480 out
[ 409.772776] res 50/00:28:e0:a3:04/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 409.772777] ata2.00: status: { DRDY }
[ 409.772779] ata2: hard resetting link
[ 410.092732] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 410.097670] ata2.00: configured for UDMA/133
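
For readers who want to interpret the SErr and SStatus values in messages like these, here is a minimal Python sketch. The bit positions follow the SATA SError and SStatus register layouts, and the short names are the ones I believe libata prints; treat both as assumptions and check them against your kernel sources. The helper names decode_serror and decode_sstatus are made up for this illustration.

```python
# Hypothetical helpers for decoding values from the kernel log.
# Bit-to-name table: assumed to match the names libata prints
# (e.g. "UnrecovData", "Handshk"); verify against your kernel.
SERROR_BITS = {
    0: "RecovData", 1: "RecovComm", 8: "UnrecovData", 9: "Persist",
    10: "Proto", 11: "HostInt", 16: "PHYRdyChg", 17: "PHYInt",
    18: "CommWake", 19: "10B8B", 20: "Dispar", 21: "BadCRC",
    22: "Handshk", 23: "LinkSeq", 24: "TrStaTrns", 25: "UnrecFIS",
    26: "DevExch",
}

# SPD field of SStatus: negotiated link generation.
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_serror(value):
    """Return the names of the bits set in an SError value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


def decode_sstatus(value):
    """Split SStatus into DET (bits 0-3), SPD (bits 4-7) and IPM (bits 8-11)."""
    return {
        "det": value & 0xF,
        "spd": SPEEDS.get((value >> 4) & 0xF, "unknown"),
        "ipm": (value >> 8) & 0xF,
    }


# Values taken from the log above; the kernel prints them in hexadecimal.
print(decode_serror(0x400100))
print(decode_sstatus(0x133)["spd"])
print(decode_sstatus(0x123)["spd"])
```

With the values from the log, SErr 0x400100 has bits 8 and 22 set, which matches the "UnrecovData Handshk" line, and the SPD field of SStatus drops from 3 to 2 when the link comes back up at 3.0 Gbps.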
Last week, taking a hint from the penultimate line of the log above ("SATA
link up 3.0 Gbps"), I tried lowering the speed of the SATA link permanently,
and it worked. I did this by adding "libata.force=2:3.0Gbps" to the kernel
command line (configured in /etc/default/grub).
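
On Debian the option presumably goes into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by running update-grub and rebooting; which variable was actually used is an assumption here. To check that the option really reached the kernel, something like the following Python sketch can parse a command line. The function name parse_libata_force is hypothetical, and the parser only handles the comma-separated "[ID:]VAL" form described in the kernel parameter documentation.

```python
def parse_libata_force(cmdline):
    """Return (port_id, value) pairs found in libata.force= parameters.

    port_id is None when an entry has no "ID:" prefix, which means the
    value is not restricted to one port.
    """
    prefix = "libata.force="
    entries = []
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        for item in token[len(prefix):].split(","):
            # Split on the last colon: "2:3.0Gbps" -> ("2", "3.0Gbps").
            port, sep, value = item.rpartition(":")
            entries.append((port if sep else None, value))
    return entries


# Example command line; the parts other than libata.force are invented.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet libata.force=2:3.0Gbps"
print(parse_libata_force(example))
```

On a running system the string to parse would be the contents of /proc/cmdline. The "2" is the port number from the "ata2" prefix in the kernel messages.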
Since then, no reset has happened; I am confident that seven days without
one is not a coincidence.
As I said, I consider the issue closed from my point of view. If someone
wants to investigate further (for example a kernel hacker to actually fix
this, or a distro developer to make an automatic work-around), I can give
some more details, and possibly run a few tests if they do not take much
time and are not too risky.
Hope this helps.
Regards,
--
Nicolas George

* Re: ATA resets with Intel 8/C220 and HGST drive
From: Dan Ritter @ 2015-04-27 19:41 UTC
To: Nicolas George; +Cc: debian-user, linux-kernel

On Mon, Apr 27, 2015 at 03:03:58PM +0200, Nicolas George wrote:
> [ 409.772773] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 409.772776] ata2.00: cmd 61/28:88:e0:a3:04/00:00:02:00:00/40 tag 17 ncq 20480 out
> [ 409.772776] res 50/00:28:e0:a3:04/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
> [ 409.772777] ata2.00: status: { DRDY }
> [ 409.772779] ata2: hard resetting link
> [ 410.092732] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> [ 410.097670] ata2.00: configured for UDMA/133
>
> Last week, hinted by the penultimate line, I tried to lower the speed of the
> SATA link permanently, and it worked. I did this by adding
> "libata.force=2:3.0Gbps" to the kernel command line (configured using
> /etc/default/grub).
>
> Since then, no reset happened; I am confident that seven days without them
> are not a coincidence.

Two options occur to me:

1. There may be a firmware update for your disk.
2. You may have a bad SATA cable.

-dsr-

* Re: ATA resets with Intel 8/C220 and HGST drive
From: Selim T. Erdoğan @ 2015-04-29 19:59 UTC
To: Nicolas George; +Cc: debian-user, linux-kernel

On Mon, Apr 27, 2015 at 03:03:58PM +0200, Nicolas George wrote:
> [...]
>
> Last week, hinted by the penultimate line, I tried to lower the speed of the
> SATA link permanently, and it worked. I did this by adding
> "libata.force=2:3.0Gbps" to the kernel command line (configured using
> /etc/default/grub).
>
> Since then, no reset happened; I am confident that seven days without them
> are not a coincidence.

I had a similar experience with a Sony Vaio VGN-NS140 laptop (from 2008)
when its hard drive died a few years ago. The replacement drives (new or
used) that I tried would work for a little while, usually long enough to
install Debian, but would get corrupted within a few hours.

I would see messages like yours above, about going to a lower SATA speed.
From 3.0Gbps to 1.5 Gbps in my case. But that wouldn't keep the drive from
getting corrupted. (Maybe it was trying to auto-negotiate back to a higher
speed, I don't remember.)

I finally solved it like you, by permanently setting the libata.force option
to 1.5Gbps. It worked, but the new replacement drive I had bought was an
SSD, so I was a little unhappy I had to use it at the lower speed.

In my case, the original hard drive that came out of the machine, a Seagate
Momentus, had a jumper which set the maximum speed to 1.5Gbps. Presumably,
Sony knew that the machine wasn't able to handle higher speeds or
auto-negotiation of the speed, so they set that jumper. However, the
replacement drives I tried didn't have such speed-limiting options, so I had
to set it in the kernel module option.

(BTW, a few months ago I bought a used Thinkpad which came with a Seagate
Momentus in it so I was able to set the jumper and stick that drive in the
Sony, freeing up my SSD for use in the Thinkpad, at its "unreduced" speed.)

> [...]