From: "Martin Ammermüller" <tenco@gmx.de>
To: Tejun Heo <htejun@gmail.com>
Cc: linux-ide@vger.kernel.org, jgarzik@pobox.com
Subject: Re: [sata_sil] kernel 2.6.17(-mm2) test - timeout issue
Date: Sat, 05 Aug 2006 15:36:45 +0200
Message-ID: <1154785005.9220.1.camel@localhost>
In-Reply-To: <44CD1512.1060802@gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 4338 bytes --]
On Monday, 31.07.2006, at 05:22 +0900, Tejun Heo wrote:
> Martin Ammermüller wrote:
> > With high disk I/O and a 2.6.18-rc1 kernel i get these errors (depending
> > upon the work i do, up to several times a day):
> >
> > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x400000 action 0x2 frozen
> > ata1.00: (BMDMA stat 0x20)
> > ata1.00: tag 0 cmd 0xc8 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
>
> Hmm... Interesting. It gets HSM violation first.
>
> > ata1: soft resetting port
> > ata1: port is slow to respond, please be patient
> > ata1: port failed to respond (30 secs)
> > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ATA: abnormal status 0xD8 on port 0xDCA18087
> > ata1.00: qc timeout (cmd 0xec)
> > ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> > ata1.00: revalidation failed (errno=-5)
> > ata1: failed to recover some devices, retrying in 5 secs
> > ata1: hard resetting port
> > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata1: EH complete
>
> Then two timeouts while recovering.
>
> > SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
> > sda: Write Protect is off
> > sda: Mode Sense: 00 3a 00 00
> > SCSI device sda: drive cache: write back
> >
> >> Anyways, if your harddisk is doing this regularly,
> >> your hardware is faulty. Maybe the connection between the controller
> >> and the disk is the problem or the disk itself.
> >
> > I did not get those errors with Windows XP and i am not the only one who
> > has problems running this particular laptop model with a linux kernel.
> > Ok, to be honest, there's actually only one person i know of which
> > bothered enough about exactly the same errors to send me an e-mail (he
> > discovered at least one of my messages to this list). But in my
> > experience there are almost always others getting the same error, but
> > which remain silent.
>
> It might be that the drive is quirky and raises interrupts prematurely
> sometimes. Depending on how the driver performs recovery, the effect
> can be hidden from user. Can you try the attached patch and see how the
> kernel acts?
I tried the patch, but I couldn't see any change in the kernel output. I
also noticed that there are actually two slightly different error
messages.
#1 (shorter one, without HSM violation):
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: (BMDMA stat 0x21)
ata1.00: tag 0 cmd 0xc8 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient
ata1: port failed to respond (30 secs)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
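As a reading aid, the "SATA link up" line in the message above can be split into its register fields. This is a minimal sketch, assuming the kernel prints SStatus and SControl in hexadecimal and that both follow the three 4-bit field layout (DET, SPD, IPM) of the SATA specification. The function and table names are illustrative and not taken from the kernel.

```python
# Sketch: split the SStatus / SControl values from the "SATA link up" line
# into their three 4-bit fields. Assumes the values are printed in hex and
# use the SATA DET/SPD/IPM layout; all names here are illustrative.

DET_STATUS = {
    0x0: "no device detected, no PHY link",
    0x1: "device detected, PHY link not established",
    0x3: "device detected, PHY link established",
    0x4: "PHY offline",
}
SPD_STATUS = {
    0x0: "no negotiated speed",
    0x1: "Gen1 (1.5 Gbps)",
    0x2: "Gen2 (3.0 Gbps)",
}
IPM_STATUS = {
    0x0: "no device or no link",
    0x1: "active",
    0x2: "partial power state",
    0x6: "slumber power state",
}


def split_fields(value):
    """Return (DET, SPD, IPM), i.e. bits 3:0, 7:4 and 11:8 of the register."""
    return value & 0xF, (value >> 4) & 0xF, (value >> 8) & 0xF


def describe_sstatus(value):
    """Map the three SStatus fields to short descriptions."""
    det, spd, ipm = split_fields(value)
    return (
        DET_STATUS.get(det, "reserved"),
        SPD_STATUS.get(spd, "reserved"),
        IPM_STATUS.get(ipm, "reserved"),
    )


if __name__ == "__main__":
    for text in describe_sstatus(0x113):   # "SStatus 113" from the log
        print(text)
    print(split_fields(0x300))             # "SControl 300" from the log
```

Under these assumptions, SStatus 113 reads as an established Gen1 link in the active power state, which agrees with the "1.5 Gbps" the kernel prints. SControl 300 has DET and SPD at zero and IPM at 3, which in the SATA layout means that transitions to both partial and slumber are disabled.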
#2 (longer, with HSM violation):
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x20)
ata1.00: tag 0 cmd 0xc8 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
ata1: soft resetting port
ata1: port is slow to respond, please be patient
ata1: port failed to respond (30 secs)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ATA: abnormal status 0xD8 on port 0xDCA44087
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
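The two messages differ mainly in a few register values, so it may help to decode them side by side. The following is a sketch only: it assumes the standard ATA status byte layout, the conventional PCI IDE bus-master status bits for "BMDMA stat", and the SATA SError bit numbering. The Emask table lists only the two values whose meaning the log itself prints. All table and function names are illustrative.

```python
# Sketch: decode the register values that differ between message #1 and #2.
# Bit layouts are assumptions based on the ATA and SATA specifications.

ATA_STATUS_BITS = [
    (0x80, "BSY"),    # device busy
    (0x40, "DRDY"),   # device ready
    (0x20, "DF"),     # device fault
    (0x10, "DSC"),    # seek complete (command dependent in later ATA revisions)
    (0x08, "DRQ"),    # data request: device wants a PIO data transfer
    (0x04, "CORR"),   # obsolete
    (0x02, "IDX"),    # obsolete
    (0x01, "ERR"),    # error
]

BMDMA_STATUS_BITS = [
    (0x01, "DMA active"),
    (0x02, "DMA error"),
    (0x04, "interrupt"),
    (0x20, "drive 0 DMA capable"),
    (0x40, "drive 1 DMA capable"),
]

SERR_BITS = [
    (1 << 22, "handshake error"),   # the only SErr bit seen in these logs
]

EMASK_BITS = [
    (0x2, "HSM violation"),         # as printed in message #2
    (0x4, "timeout"),               # as printed in message #1
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in table if value & bit]


if __name__ == "__main__":
    samples = {
        "#1": {"stat": 0x40, "bmdma": 0x21, "serr": 0x0, "emask": 0x4},
        "#2": {"stat": 0x58, "bmdma": 0x20, "serr": 0x400000, "emask": 0x2},
    }
    for label, regs in samples.items():
        print(label)
        print("  stat :", decode(regs["stat"], ATA_STATUS_BITS))
        print("  BMDMA:", decode(regs["bmdma"], BMDMA_STATUS_BITS))
        print("  SErr :", decode(regs["serr"], SERR_BITS))
        print("  Emask:", decode(regs["emask"], EMASK_BITS))
    # Status seen while polling after the soft reset in message #2.
    print("0xD8 :", decode(0xD8, ATA_STATUS_BITS))
```

Read this way, message #1 shows a READ DMA (cmd 0xc8) that timed out while the bus-master engine was still marked active and the drive reported only DRDY. Message #2 shows the DMA engine idle but DRQ still set in the status byte, together with an SErr handshake error, and the drive then stays busy (BSY set in 0xD8) through the soft reset until the hard reset brings it back.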
Additionally, I have attached the output of "smartctl --all -d ata /dev/sda".
Regards,
Martin Ammermüller
[-- Attachment #1.2: smart_all --]
[-- Type: text/plain, Size: 5055 bytes --]
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MK8032GSX
Serial Number: 26GI5560S
Firmware Version: AS111G
User Capacity: 80.026.361.856 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 6
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sat Aug 5 15:33:19 2006 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 331) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  65) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       2104
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       482
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       1729
 10 Spin_Retry_Count        0x0033   109   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       394
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       97
193 Load_Cycle_Count        0x0032   089   089   000    Old_age   Always       -       111507
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       37 (Lifetime Min/Max 15/50)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       122
222 Loaded_Hours            0x0032   098   098   000    Old_age   Always       -       1168
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       324
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%             1729  -
# 2  Short offline       Completed without error        00%             1728  -
# 3  Short offline       Completed without error        00%             1128  -
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
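As a quick consistency check between the attachment and the kernel messages, the sector count the kernel reports can be compared with the capacity smartctl reports. This is a small sketch using only numbers that appear in this mail. It assumes the dots in "80.026.361.856" are thousands separators and that the kernel's "MB" means 10^6 bytes; the function name is illustrative.

```python
# Sketch: cross-check the kernel's sector count against smartctl's capacity.
# All constants are copied from this mail; the helper name is illustrative.

KERNEL_SECTORS = 156301488          # "SCSI device sda: 156301488 512-byte hdwr sectors"
SECTOR_SIZE = 512                   # bytes per sector, from the same line
KERNEL_MB = 80026                   # "(80026 MB)" from the same line
SMART_CAPACITY = 80_026_361_856     # "User Capacity: 80.026.361.856 bytes"


def capacity_bytes(sectors, sector_size=SECTOR_SIZE):
    """Total capacity in bytes for a given sector count."""
    return sectors * sector_size


if __name__ == "__main__":
    total = capacity_bytes(KERNEL_SECTORS)
    print("bytes from sector count:", total)
    print("matches smartctl       :", total == SMART_CAPACITY)
    print("matches kernel MB      :", total // 10**6 == KERNEL_MB)
```

Both views agree, so the drive is identified with the same geometry after each recovery as the one it reports through SMART.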
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]
Thread overview: 7+ messages
2006-06-24 20:50 [sata_sil] kernel 2.6.17(-mm2) test - timeout issue Martin Ammermüller
2006-06-25 3:06 ` Tejun Heo
2006-07-24 8:21 ` Martin Ammermüller
2006-07-30 20:22 ` Tejun Heo
2006-08-05 13:36 ` Martin Ammermüller [this message]
2006-08-06 15:51 ` Tejun Heo
2006-08-14 10:12 ` Martin Ammermüller