From: Tejun Heo <htejun@gmail.com>
To: Denys Dmytriyenko <denis@denix.org>
Cc: Mark Lord <liml@rtr.ca>, Gabor FUNK <FUNK.Gabor@hunetkft.hu>,
linux-ide@vger.kernel.org, Jim Paris <jim@jtan.com>
Subject: Re: sata_sil24 stability and performance
Date: Tue, 18 Mar 2008 15:40:01 +0900
Message-ID: <47DF63C1.5090205@gmail.com>
In-Reply-To: <20080318045316.GA3959@denix.org>
Hello,
Denys Dmytriyenko wrote:
>> Hmmm... This is first. Which driver is it? It means that controller is
>> reporting that NCQ command tags which are not issued (or already
>> completed) are in-flight. Due to the way hdd reports NCQ command
>> completion, it's not possible for the drive to cause this. This gotta
>> be a bug on the host side (be it controller chip or more likely the
>> driver). The command tag in question is 5. Only 0, 3 and 4 were in flight.
>
> It is sata_sil24 on 2.6.23.9. If there were related fixes in the recent
> versions, I can retest it.
No, not that I know of.
>> This one is different. The drive reported device error but the driver
>> couldn't get more information about the error (log page 10h contains
>> it). What does smartctl -a on the drive say?
>
> # smartctl -a /dev/sdc
> smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> 9 Power_On_Hours 0x0032 242 242 000 Old_age Always - 3941
Okay, power on hours is 3941.
> Error 42 occurred at disk power-on lifetime: 3444 hours (143 days + 12 hours)
> When the command that caused the error occurred, the device was in an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 84 41 28 ff 46 5a 40
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 60 08 28 ff 46 5a 40 00 2d+07:38:11.073 READ FPDMA QUEUED
> 60 08 28 ff 46 5a 40 00 2d+07:38:11.073 READ FPDMA QUEUED
> 60 08 28 ff 46 5a 40 00 2d+07:38:11.073 READ FPDMA QUEUED
> 60 10 20 2f 47 5a 40 00 2d+07:38:11.073 READ FPDMA QUEUED
> 60 08 18 1f 47 5a 40 00 2d+07:38:11.073 READ FPDMA QUEUED
Error 42 occurred about 21 days ago (3941 - 3444 = 497 power-on hours).
Unless your clock is off, I don't think this is what you've seen, but
the error is UNC (uncorrectable media error), so it does mean that your
drive has some bad sectors, which can explain the device error you saw.
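
As a rough check of that estimate, the sketch below redoes the arithmetic
using only the two hour counts from the smartctl output quoted above. The
function and constant names are made up for illustration. Note that the
result is powered-on time, so it only matches calendar time if the drive
has been running continuously since the error.

```python
# Hour counts copied from the smartctl output quoted in this message.
POWER_ON_HOURS_NOW = 3941   # attribute 9, Power_On_Hours raw value
ERROR_42_HOURS = 3444       # "Error 42 occurred at disk power-on lifetime"


def hours_since(now_hours, event_hours):
    """Powered-on hours elapsed between a logged event and now."""
    return now_hours - event_hours


def split_hours(hours):
    """Split an hour count into (whole days, leftover hours)."""
    return divmod(hours, 24)


elapsed = hours_since(POWER_ON_HOURS_NOW, ERROR_42_HOURS)
days, hours = split_hours(elapsed)
print(f"{elapsed} powered-on hours ago = {days} days + {hours} hours")
```

The same split reproduces smartctl's own annotation: 3444 hours is
143 days + 12 hours.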
> Error 41 occurred at disk power-on lifetime: 3405 hours (141 days + 21 hours)
> When the command that caused the error occurred, the device was in an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 00 41 01 10 00 00 a0 Error:
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 2f 00 01 10 00 00 a0 00 12:51:00.112 READ LOG EXT
> 60 20 20 7f 32 4c 40 00 12:51:00.081 READ FPDMA QUEUED
> 60 08 18 6f 32 4c 40 00 12:51:00.081 READ FPDMA QUEUED
> 60 30 10 9f 32 4c 40 00 12:51:00.081 READ FPDMA QUEUED
> 60 08 08 5f 32 4c 40 00 12:51:00.081 READ FPDMA QUEUED
Hmm... this one is less clear. Maybe the device wasn't expecting READ LOG
EXT as it was still in NCQ command phase and got surprised?
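
To make the command list above easier to read, here is a minimal sketch
that decodes the CR, FR, SC and SN columns for the two opcodes that appear
in it. It assumes the standard register layout: for READ FPDMA QUEUED the
Features register carries the sector count and bits 7:3 of Sector Count
carry the NCQ tag; for READ LOG EXT the Sector Count is the number of log
pages and the low LBA byte (SN) is the log address. smartctl 5.37 prints
only the low byte of each register, so the sector count here is the low
byte only. The decode() helper is hypothetical, not part of any tool.

```python
READ_FPDMA_QUEUED = 0x60
READ_LOG_EXT = 0x2F


def decode(cr, fr, sc, sn):
    """Decode the CR, FR, SC and SN columns of one smartctl command row."""
    if cr == READ_FPDMA_QUEUED:
        # NCQ read: Features = sector count (low byte), SC bits 7:3 = tag.
        return {"cmd": "READ FPDMA QUEUED", "tag": sc >> 3, "sectors": fr}
    if cr == READ_LOG_EXT:
        # Log read: SC = number of log pages, SN (LBA low) = log address.
        return {"cmd": "READ LOG EXT", "log": sn, "pages": sc}
    return {"cmd": f"unknown (0x{cr:02x})"}


# Rows from the "Error 41" command list, oldest last as smartctl prints them.
error_41_rows = [
    (0x2F, 0x00, 0x01, 0x10),
    (0x60, 0x20, 0x20, 0x7F),
    (0x60, 0x08, 0x18, 0x6F),
    (0x60, 0x30, 0x10, 0x9F),
    (0x60, 0x08, 0x08, 0x5F),
]

for row in error_41_rows:
    print(decode(*row))
```

Read this way, the trace shows four NCQ reads with tags 4, 3, 2 and 1,
followed about 31 ms later by a one-page READ LOG EXT of log 10h.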
Currently you're the first and only one to report the illegal qc_active
transition problem. I'd like to know what precedes the error, which
isn't exactly easy in retrospect. For now, please keep an eye on those
errors and report if you can see any pattern. And just in case, can you
get 2.6.24 on the machine and see if anything changes?
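
For readers unfamiliar with what an illegal qc_active transition means,
the sketch below models the idea with plain bitmasks: each in-flight NCQ
tag is one bit, and a report from the controller may only clear bits,
never set a bit that was not issued. It uses the tags mentioned earlier
in the thread (0, 3 and 4 in flight, tag 5 reported). The function names
are hypothetical and this is not the driver's actual code.

```python
def tags_to_mask(tags):
    """Build a bitmask with one bit set per NCQ tag."""
    mask = 0
    for tag in tags:
        mask |= 1 << tag
    return mask


def completed_tags(issued_mask, reported_mask):
    """Return the mask of tags that completed, or raise if the controller
    reports a tag as active that the host never had in flight."""
    changed = issued_mask ^ reported_mask
    if changed & reported_mask:
        # A bit went from 0 to 1: the controller claims a command is
        # in flight that was not issued (or had already completed).
        raise ValueError(
            f"illegal transition ({issued_mask:08x}->{reported_mask:08x})"
        )
    return changed


issued = tags_to_mask([0, 3, 4])

# Legal: tag 3 finished, tags 0 and 4 are still active.
print(f"{completed_tags(issued, tags_to_mask([0, 4])):08x}")

# Illegal: tag 5 shows up as active although it was never issued.
try:
    completed_tags(issued, tags_to_mask([0, 3, 4, 5]))
except ValueError as err:
    print(err)
```

Because a completion report can only ever clear bits, a newly set bit
cannot be explained by anything the host issued, which is why this case
is treated as an error.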
--
tejun