linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* SATA errors
@ 2009-08-09 10:36 Miah Gregory
  2009-08-09 17:24 ` Robert Hancock
  2009-08-09 19:57 ` Mikael Pettersson
  0 siblings, 2 replies; 5+ messages in thread
From: Miah Gregory @ 2009-08-09 10:36 UTC (permalink / raw)
  To: linux-ide

Hi,

One of my servers has started to log slightly odd errors following one
of the software RAID arrays having been degraded due to an error on sdb.

ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         res 40/00:00:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: hard resetting link
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
sd 3:0:0:0: [sdb] 976773168 512-byte hardware sectors: (500 GB/465 GiB)
end_request: I/O error, dev sdb, sector 976767834
md: super_written gets error=-5, uptodate=0
raid1: Disk failure on sdb3, disabling device.
raid1: Operation continuing on 1 devices.
sd 3:0:0:0: [sdb] Write Protect is off
sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:0, o:1, dev:sda3
 disk 1, wo:1, o:0, dev:sdb3
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:0, o:1, dev:sda3

I tried running various smart tests and have run badblocks in R/W mode
across the whole surface of sdb, but did not find any obvious cause.

Do the following error logs point at anything specifically, all of which
have been seen since the above:

-snip-
ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata4.00: cmd b0/d5:01:09:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
         res 40/00:00:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: hard resetting link
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
sd 3:0:0:0: [sdb] 976773168 512-byte hardware sectors: (500 GB/465 GiB)
sd 3:0:0:0: [sdb] Write Protect is off
sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
-snip-

-snip-
ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata4.00: cmd b0/d5:01:09:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
         res 40/00:00:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: hard resetting link
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
sd 3:0:0:0: [sdb] 976773168 512-byte hardware sectors: (500 GB/465 GiB)
sd 3:0:0:0: [sdb] Write Protect is off
sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
-snip-

-snip-
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata3.00: cmd b0/d5:01:09:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
         res 40/00:00:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
sd 2:0:0:0: [sda] 976773168 512-byte hardware sectors: (500 GB/465 GiB)
sd 2:0:0:0: [sda] Write Protect is off
sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
-snip-

-snip-
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd b0/d5:01:09:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
         res 40/00:00:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: hard resetting link
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: configured for UDMA/133
sd 4:0:0:0: [sdc] 976773168 512-byte hardware sectors: (500 GB/465 GiB)
sd 4:0:0:0: [sdc] Write Protect is off
sd 4:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
-snip-

All of these errors might possibly related to smart self checking which
I've now set to run more regularly - although I wouldn't really expect a
reset being required to get the disk to respond during one.

I'm running a 2.6.28.2 kernel in a machine with 4 WDC WD5000AACS-0 500GB
disks.

I'm unable to check how the disks are physically wired at the moment,
however they're probably all on the promise controller:

00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
01:04.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02)

Any assistance greatfully received - please let me know if further
information is needed.

-- 
Miah Gregory


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-08-09 21:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-09 10:36 SATA errors Miah Gregory
2009-08-09 17:24 ` Robert Hancock
2009-08-09 19:57 ` Mikael Pettersson
2009-08-09 21:11   ` Miah Gregory
2009-08-09 21:35     ` Mikael Pettersson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).