From: Thomas Fjellstrom <tfjellstrom@shaw.ca>
To: linux-kernel@vger.kernel.org
Subject: Re: sata exception frozen timeout?
Date: Tue, 1 Sep 2009 04:28:18 -0600 [thread overview]
Message-ID: <200909010428.18403.tfjellstrom@shaw.ca> (raw)
In-Reply-To: <200909010316.58715.tfjellstrom@shaw.ca>
On Tue September 1 2009, Thomas Fjellstrom wrote:
> I've recently started seeing the following messages from a new WD Green
> drive, not long after I had to RMA a Seagate drive for a strange (but
> different) error. Highlights from my dmesg include:
>
> [59634.792101] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> frozen
> [59634.792121] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
> [59634.792125] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4
> (timeout)
> [59634.792133] ata6.00: status: { DRDY }
> [59634.792145] ata6: hard resetting link
> [59635.277695] ata6: softreset failed (device not ready)
> [59635.277705] ata6: failed due to HW bug, retry pmp=0
> [59635.441570] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [59635.454030] ata6.00: configured for UDMA/133
> [59635.454068] ata6: EH complete
> ...
> [59713.792224] ata6.00: NCQ disabled due to excessive errors
> [59713.792236] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> frozen
> [59713.792254] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
> [59713.792257] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4
> (timeout)
> [59713.792264] ata6.00: status: { DRDY }
> ...
> [63484.781764] ata6: hard resetting link
> [63485.264204] ata6: softreset failed (device not ready)
> [63485.264214] ata6: failed due to HW bug, retry pmp=0
> [63485.428205] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [63485.440499] ata6.00: configured for UDMA/100
> [63485.440536] ata6: EH complete
> [63493.780182] ata6.00: limiting speed to UDMA/33:PIO4
> [63493.780194] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> frozen
> [63493.780211] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
> [63493.780214] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4
> (timeout)
> [63493.780221] ata6.00: status: { DRDY }
> ...
> [99318.070669] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [99318.070679] ata6.00: irq_stat 0x40000001
> [99318.070698] ata6.00: cmd 25/00:00:30:bd:01/00:01:1b:00:00/e0 tag 0 dma
> 131072 in
> [99318.070701] res 41/04:ef:30:bd:01/00:00:1b:00:00/e0 Emask 0x1
> (device error)
> [99318.070709] ata6.00: status: { DRDY ERR }
> [99318.070714] ata6.00: error: { ABRT }
> [99318.082588] ata6.00: configured for UDMA/33
> [99318.082612] ata6: EH complete
> ...
> [99339.780358] ata6: hard resetting link
> [99349.785096] ata6: softreset failed (device not ready)
> [99349.785108] ata6: hard resetting link
> [99359.789033] ata6: softreset failed (device not ready)
> [99359.789045] ata6: hard resetting link
> [99370.352916] ata6: link is slow to respond, please be patient (ready=0)
> ..
> [99400.016925] ata6: softreset failed (device not ready)
> [99400.016936] ata6: reset failed, giving up
> [99400.016942] ata6.00: disabled
> [99400.016986] ata6: EH complete
> [99400.017021] sd 5:0:0:0: [sdf] Unhandled error code
> [99400.017026] sd 5:0:0:0: [sdf] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK
> [99400.017021] sd 5:0:0:0: [sdf] Unhandled error code
> [99400.017026] sd 5:0:0:0: [sdf] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK
> [99400.017035] end_request: I/O error, dev sdf, sector 453099568
> [99400.017045] Buffer I/O error on device md1, logical block 113274390
> [99400.017058] Buffer I/O error on device md1, logical block 113274391
> [99400.017064] Buffer I/O error on device md1, logical block 113274392
> [99400.017070] Buffer I/O error on device md1, logical block 113274393
> [99400.017076] Buffer I/O error on device md1, logical block 113274394
> [99400.017082] Buffer I/O error on device md1, logical block 113274395
> [99400.017088] Buffer I/O error on device md1, logical block 113274396
> [99400.017094] Buffer I/O error on device md1, logical block 113274397
> [99400.017100] Buffer I/O error on device md1, logical block 113274398
> [99400.017106] Buffer I/O error on device md1, logical block 113274399
> [99400.017115] sd 5:0:0:0: [sdf] Unhandled error code
> [99400.017123] sd 5:0:0:0: [sdf] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK
> [99400.017133] end_request: I/O error, dev sdf, sector 453099824
> [99400.017310] sd 5:0:0:0: [sdf] Unhandled error code
> [99400.017315] sd 5:0:0:0: [sdf] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK
> [99400.017322] end_request: I/O error, dev sdf, sector 453099568
> [99400.017383] program smartctl is using a deprecated SCSI ioctl, please
> convert it to SG_IO
> [99400.017456] program smartctl is using a deprecated SCSI ioctl, please
> convert it to SG_IO
> [99400.018822] sd 5:0:0:0: [sdf] Unhandled error code
> [99400.018829] sd 5:0:0:0: [sdf] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK
> [99400.018838] end_request: I/O error, dev sdf, sector 453099568
>
> and a bunch more end_request errors. This happens fairly reliably after a
> day or so of being online. I try and start to play some music or watch a
> video off the array (2xWD Green on md raid0) and it fails. The file system
> is mostly read only right now (that is I don't write to it) so I haven't
> manged to loose any data that I know of, and the fs seems to be fine. A
> reboot fixes the errors and the disk and fs work fine for a day or two
> before these errors show up again.
>
> I'm currently running 2.6.31-rc5 from kernel.org. What would cause these
> errors?
It's now happened again after only 30 minutes or so. I'm trying with an older
kernel (debian's 2.6.30-6 from sid) now. So far so good.
--
Thomas Fjellstrom
tfjellstrom@shaw.ca
next prev parent reply other threads:[~2009-09-01 10:28 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-01 9:16 sata exception frozen timeout? Thomas Fjellstrom
2009-09-01 10:28 ` Thomas Fjellstrom [this message]
2009-09-01 20:34 ` Thomas Fjellstrom
2009-09-02 4:32 ` Robert Hancock
2009-09-02 6:42 ` Thomas Fjellstrom
2009-09-03 19:38 ` BIOS update == more errors (was Re: sata exception frozen timeout?) Thomas Fjellstrom
2009-09-03 20:11 ` Thomas Fjellstrom
2009-09-03 22:54 ` Thomas Fjellstrom
2009-09-04 3:43 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200909010428.18403.tfjellstrom@shaw.ca \
--to=tfjellstrom@shaw.ca \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.