* SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
@ 2004-08-12 16:22 Michael G. Morey
2004-08-12 17:32 ` Bernd Schubert
0 siblings, 1 reply; 8+ messages in thread
From: Michael G. Morey @ 2004-08-12 16:22 UTC (permalink / raw)
To: Reiser Filesystem User List
All,
I've installed and run the smartmontools package backported to Debian
GNU/Linux 3.0r1 from http://www.backports.org, and have run the short
and extended Offline self tests. The results are as follows:
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        80%             4089  8371832
# 2  Short offline       Completed without error        00%             4088  -
# 3  Short offline       Completed without error        00%                0  -
My manager suggested that I reformat the partitions (we use ReiserFS
3.6), to mark the bad blocks. Is this a viable option? Can the drive
be salvaged, or should it be replaced?
Thanks in advance.
--
Michael G. Morey <mmorey@optivel.com>
Optivel
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-12 16:22 SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB Michael G. Morey
@ 2004-08-12 17:32 ` Bernd Schubert
2004-08-12 17:48 ` Michael G. Morey
2004-08-12 20:18 ` Lamont R. Peterson
0 siblings, 2 replies; 8+ messages in thread
From: Bernd Schubert @ 2004-08-12 17:32 UTC (permalink / raw)
To: reiserfs-list
> My manager suggested that I reformat the partitions (we use ReiserFS
> 3.6), to mark the bad blocks. Is this a viable option? Can the drive
> be salvaged, or should it be replaced?
>
What's your definition of 'reformatting'? It's certainly not
'mkreiserfs /dev/my_partition'. You would need a tool for a low-level format
of the disk; I don't know if there are any for IDE disks.
Smartctl should also return the 'reallocated sector count' value. If there
are only a few reallocated sectors, it might be worth running 'badblocks
-n /dev/my_device'; I think most drives only reallocate sectors when data is
written to them. After this has finished, run the badblocks and smartctl
commands again. If there are still any errors, I would just get a new drive.
Somewhere on the namesys site you will also find an article about bad-block
handling with reiserfs, but I would do this only in an absolute emergency, if I
needed to rescue my data.
Cheers,
Bernd
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-12 17:32 ` Bernd Schubert
@ 2004-08-12 17:48 ` Michael G. Morey
2004-08-12 18:28 ` Bernd Schubert
2004-08-12 20:18 ` Lamont R. Peterson
1 sibling, 1 reply; 8+ messages in thread
From: Michael G. Morey @ 2004-08-12 17:48 UTC (permalink / raw)
To: Bernd Schubert; +Cc: Reiser Filesystem User List
On Thu, 2004-08-12 at 12:32, Bernd Schubert wrote:
> > My manager suggested that I reformat the partitions (we use ReiserFS
> > 3.6), to mark the bad blocks. Is this a viable option? Can the drive
> > be salvaged, or should it be replaced?
> >
>
> What's your definition of 'reformatting'? It's certainly not
> 'mkreiserfs /dev/my_partition'. You would need a tool for a low-level format
> of the disk; I don't know if there are any for IDE disks.
>
> Smartctl should also return the 'reallocated sector count' value. If there
> are only a few reallocated sectors, it might be worth running 'badblocks
> -n /dev/my_device'; I think most drives only reallocate sectors when data is
> written to them. After this has finished, run the badblocks and smartctl
> commands again. If there are still any errors, I would just get a new drive.
>
> Somewhere on the namesys site you will also find an article about bad-block
> handling with reiserfs, but I would do this only in an absolute emergency, if I
> needed to rescue my data.
>
>
> Cheers,
> Bernd
Bernd,
The Reallocated Sector Count appears to be 91. I'm not entirely sure
how to interpret the SMART Attributes with Thresholds table. What is
the meaning of the VALUE, WORST, and THRESH columns? What is your
assessment of my hard drive? I've attached the output of smartctl --all.
Thanks.
Michael
--
Michael G. Morey <mmorey@optivel.com>
Optivel
[-- Attachment #2: smartctl-all-turing.log --]
[-- Type: text/plain, Size: 9743 bytes --]
--- working directory: /home/mmorey/
% sudo smartctl --all /dev/hda
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: HITACHI_DK23FB-60
Serial Number: 1MG960
Firmware Version: 00M0A0C1
Device is: In smartctl database [for details use: -P show]
ATA Version is: 5
ATA Standard is: ATA/ATAPI-5 T13 1321D revision 3
Local Time is: Thu Aug 12 12:46:43 2004 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x80) Offline data collection activity
was never started.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (2150) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 37) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000d 096 095 050 Pre-fail Offline - 201863463061
2 Throughput_Performance 0x0005 100 096 050 Pre-fail Offline - 3120
3 Spin_Up_Time 0x0007 100 100 050 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 519
5 Reallocated_Sector_Ct 0x0033 090 090 010 Pre-fail Always - 287
7 Seek_Error_Rate 0x000f 100 100 050 Pre-fail Always - 160
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 1179
9 Power_On_Minutes 0x0032 092 092 000 Old_age Always - 4114h+11m
10 Spin_Retry_Count 0x0013 100 100 050 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 505
191 G-Sense_Error_Rate 0x000a 100 093 000 Old_age Always - 77567
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 44
193 Load_Cycle_Count 0x0032 085 085 000 Old_age Always - 94302/94257
194 Temperature_Celsius 0x0022 078 050 000 Old_age Always - 51 (Lifetime Min/Max 65/15)
195 Hardware_ECC_Recovered 0x001a 090 001 000 Old_age Always - 7613
196 Reallocated_Event_Count 0x0032 072 072 000 Old_age Always - 287
197 Current_Pending_Sector 0x0032 099 098 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0013 100 100 050 Pre-fail Always - 0
201 Soft_Read_Error_Rate 0x0012 100 100 000 Old_age Always - 1
223 Load_Retry_Count 0x0012 100 100 000 Old_age Always - 0
230 Head_Amplitude 0x0032 094 094 000 Old_age Always - 180725
250 Read_Error_Retry_Rate 0x000a 100 001 000 Old_age Always - 789
SMART Error Log Version: 1
ATA Error Count: 103 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 103 occurred at disk power-on lifetime: 4113 hours (171 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 71 3f ce 9f e0 Error: UNC 113 sectors at LBA = 0x009fce3f = 10473023
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 7c 34 ce 9f e0 00 3d+02:26:03.110 READ DMA
c8 00 7e 32 ce 9f e0 00 3d+02:26:01.250 READ DMA
c8 00 80 30 ce 9f e0 00 3d+02:25:59.060 READ DMA
c8 00 08 92 f4 ab e2 00 3d+02:25:59.040 READ DMA
c8 00 80 b0 cd 9f e0 00 3d+02:25:59.040 READ DMA
Error 102 occurred at disk power-on lifetime: 4113 hours (171 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 73 3d ce 9f e0 Error: UNC 115 sectors at LBA = 0x009fce3d = 10473021
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 7e 32 ce 9f e0 00 3d+02:26:01.250 READ DMA
c8 00 80 30 ce 9f e0 00 3d+02:25:59.060 READ DMA
c8 00 08 92 f4 ab e2 00 3d+02:25:59.040 READ DMA
c8 00 80 b0 cd 9f e0 00 3d+02:25:59.040 READ DMA
c8 00 80 30 cd 9f e0 00 3d+02:25:59.010 READ DMA
Error 101 occurred at disk power-on lifetime: 4113 hours (171 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 71 3f ce 9f e0 Error: UNC 113 sectors at LBA = 0x009fce3f = 10473023
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 30 ce 9f e0 00 3d+02:25:59.060 READ DMA
c8 00 08 92 f4 ab e2 00 3d+02:25:59.040 READ DMA
c8 00 80 b0 cd 9f e0 00 3d+02:25:59.040 READ DMA
c8 00 80 30 cd 9f e0 00 3d+02:25:59.010 READ DMA
c8 00 08 8a ea ab e2 00 3d+02:25:59.000 READ DMA
Error 100 occurred at disk power-on lifetime: 4113 hours (171 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 75 3b 6a 9f e0 Error: UNC 117 sectors at LBA = 0x009f6a3b = 10447419
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 30 6a 9f e0 00 3d+02:25:47.570 READ DMA
c8 00 20 2a 84 14 e3 00 3d+02:25:47.500 READ DMA
ca 00 08 00 71 4a e4 00 3d+02:25:47.500 WRITE DMA
ca 00 08 20 70 4e e4 00 3d+02:25:47.500 WRITE DMA
ca 00 08 80 fb 4d e4 00 3d+02:25:47.490 WRITE DMA
Error 99 occurred at disk power-on lifetime: 4113 hours (171 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 52 de 69 9f e0 Error: UNC 82 sectors at LBA = 0x009f69de = 10447326
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 b0 69 9f e0 00 3d+02:25:40.780 READ DMA
c8 00 80 30 69 9f e0 00 3d+02:25:40.450 READ DMA
c8 00 08 c2 25 d7 e2 00 3d+02:25:40.450 READ DMA
ca 00 08 68 b4 4a e4 00 3d+02:25:40.450 WRITE DMA
c8 00 80 b0 68 9f e0 00 3d+02:25:40.350 READ DMA
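The LBA printed in each error record above can be recovered from the register dump. A minimal Python sketch, assuming 28-bit LBA addressing (which fits the READ DMA command 0xc8 and the 0xe0 Device/Head values in this log); the function name is made up for illustration:

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA task-file registers into a 28-bit LBA.

    SN holds LBA bits 0-7, CL bits 8-15, CH bits 16-23, and the
    low nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers reported after Error 103: SN=3f CL=ce CH=9f DH=e0
lba = lba28_from_registers(0x3F, 0xCE, 0x9F, 0xE0)
print(hex(lba), lba)  # prints: 0x9fce3f 10473023

# The failing command (first row under Error 103) started at SN=34 with SC=7c.
start = lba28_from_registers(0x34, 0xCE, 0x9F, 0xE0)
remaining = start + 0x7C - lba
print(remaining)  # prints: 113
```

The second calculation suggests that the SC register after the error (0x71 = 113) is the number of sectors that were still to be transferred when the read failed, which is how the log arrives at "UNC 113 sectors".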
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 4103 -
# 2 Extended offline Completed: read failure 80% 4089 8371832
# 3 Short offline Completed without error 00% 4088 -
# 4 Short offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Exit 192 12:46:43
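Assuming the trailing "Exit 192" line is the shell reporting smartctl's exit status, the number can be read as a bit mask. The bit meanings below are paraphrased from the RETURN VALUES section of the smartctl man page; the table and function names are made up for this sketch:

```python
from typing import List

# Bit meanings paraphrased from the smartctl man page (RETURN VALUES).
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed or device did not return IDENTIFY data",
    2: "a SMART command failed or a SMART data structure had a checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "some attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_smartctl_exit(status: int) -> List[str]:
    """Return the description of every bit set in a smartctl exit status."""
    return [msg for bit, msg in SMARTCTL_EXIT_BITS.items() if status & (1 << bit)]


for line in decode_smartctl_exit(192):
    print(line)
```

Here 192 = 64 + 128, so bits 6 and 7 are set: both the error log and the self-test log contain errors, which matches the output above.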
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-12 17:48 ` Michael G. Morey
@ 2004-08-12 18:28 ` Bernd Schubert
0 siblings, 0 replies; 8+ messages in thread
From: Bernd Schubert @ 2004-08-12 18:28 UTC (permalink / raw)
To: reiserfs-list
> The Reallocated Sector Count appears to be 91. I'm not entirely sure
I'm afraid it's even worse: the raw value is 287. Just replace this drive as
soon as possible. (My personal rule of thumb: anything below 10 is fine, at
10-30 it gets critical, and above 30 I will only store scratch data on it.)
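That rule of thumb can be written down as a small Python sketch. The cut-offs are only the personal ones stated above, not a vendor or smartmontools recommendation, and the function name is made up:

```python
def assess_reallocated(raw_count: int) -> str:
    """Classify a raw Reallocated_Sector_Ct value by the personal rule above."""
    if raw_count < 0:
        raise ValueError("raw sector count cannot be negative")
    if raw_count < 10:
        return "fine"
    if raw_count <= 30:
        return "critical"
    return "scratch data only"


print(assess_reallocated(287))  # prints: scratch data only
```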
> how to interpret the SMART Attributes with Thresholds table. What is
> the meaning of the VALUE, WORST, and THRESH columns? What is your
You can find this somewhere on the net. As far as I know, those are
hard-drive-specific values which have no further meaning without exact
knowledge of your hard drive (please correct me if I'm wrong here).
The last column is the interesting one for you: it gives the values in
human-readable format.
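For what it is worth, the smartctl man page does describe the three columns: VALUE is the attribute's current normalized value (generally, higher is better), WORST is the lowest normalized value recorded so far, and THRESH is the vendor-set level at or below which the attribute counts as failed. How raw values map to normalized ones is vendor-specific, which is why the raw column is usually easier to read. A minimal Python sketch, assuming the whitespace-separated column layout of the attached log; the names SmartAttribute, parse_attribute_row and failing_now are made up for illustration:

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int    # current normalized value
    worst: int    # lowest normalized value seen so far
    thresh: int   # failure threshold set by the vendor
    attr_type: str
    raw: str      # raw value, kept as text because some rows carry extra notes


def parse_attribute_row(row: str) -> SmartAttribute:
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    parts = row.split(None, 9)
    return SmartAttribute(
        attr_id=int(parts[0]),
        name=parts[1],
        value=int(parts[3]),
        worst=int(parts[4]),
        thresh=int(parts[5]),
        attr_type=parts[6],
        raw=parts[9],
    )


def failing_now(attr: SmartAttribute) -> bool:
    """An attribute counts as failed when VALUE is at or below THRESH."""
    return attr.value <= attr.thresh


row = "5 Reallocated_Sector_Ct 0x0033 090 090 010 Pre-fail Always - 287"
attr = parse_attribute_row(row)
print(attr.value, attr.thresh, attr.raw, failing_now(attr))
```

For this row the normalized value (90) is still well above the threshold (10), which would explain why the overall health check says PASSED even though 287 sectors have already been reallocated.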
Btw, the temperature of your drive is pretty high; for the new drive I would
try to keep it below 40 °C. The high temperature might be the reason for
your drive failure.
> assessment of my hard drive? I've attached the output of smartctl --all.
As I said above, just replace it.
Cheers,
Bernd
--
Bernd Schubert
Physikalisch Chemisches Institut / Theoretische Chemie
Universität Heidelberg
INF 229
69120 Heidelberg
e-mail: bernd.schubert@pci.uni-heidelberg.de
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-12 17:32 ` Bernd Schubert
2004-08-12 17:48 ` Michael G. Morey
@ 2004-08-12 20:18 ` Lamont R. Peterson
2004-08-13 0:12 ` Philippe Gramoullé
1 sibling, 1 reply; 8+ messages in thread
From: Lamont R. Peterson @ 2004-08-12 20:18 UTC (permalink / raw)
To: reiserfs-list
On Thu, 2004-08-12 at 11:32, Bernd Schubert wrote:
> What's your definition of 'reformatting'? It's certainly not
> 'mkreiserfs /dev/my_partition'. You would need a tool for a low-level format
> of the disk; I don't know if there are any for IDE disks.
SpinRite
[SNIP]
--
Lamont R. Peterson <lamont@gurulabs.com>
Senior Instructor
Guru Labs, L.C. http://www.GuruLabs.com/
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-12 20:18 ` Lamont R. Peterson
@ 2004-08-13 0:12 ` Philippe Gramoullé
2004-08-13 0:18 ` Lamont R. Peterson
0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gramoullé @ 2004-08-13 0:12 UTC (permalink / raw)
To: lamont; +Cc: reiserfs-list
Hello,
Are you aware of the same kind of tool for SCSI disks? (I need one for an "about to die" FUJITSU MAM3184 18 GB 15K RPM disk.)
Thanks,
Philippe
On Thu, 12 Aug 2004 14:18:59 -0600
"Lamont R. Peterson" <lamont@gurulabs.com> wrote:
| You would need a tool for a low level format
| > of the disk, I don't know if there are any for IDE disks.
|
| SpinRite
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-13 0:12 ` Philippe Gramoullé
@ 2004-08-13 0:18 ` Lamont R. Peterson
2004-08-13 2:12 ` Philippe Gramoullé
0 siblings, 1 reply; 8+ messages in thread
From: Lamont R. Peterson @ 2004-08-13 0:18 UTC (permalink / raw)
To: Philippe Gramoullé; +Cc: reiserfs-list
On Thu, 2004-08-12 at 18:12, Philippe Gramoullé wrote:
> Hello,
>
> Are you aware of the same kind of tool for SCSI disks? (I need one for an "about to die" FUJITSU MAM3184 18 GB 15K RPM disk.)
SpinRite also works for SCSI disks. Sorry, I do not have a URL. Google
for it.
> On Thu, 12 Aug 2004 14:18:59 -0600
> "Lamont R. Peterson" <lamont@gurulabs.com> wrote:
>
> | You would need a tool for a low level format
> | > of the disk, I don't know if there are any for IDE disks.
> |
> | SpinRite
--
Lamont R. Peterson <lamont@gurulabs.com>
Senior Instructor
Guru Labs, L.C. http://www.GuruLabs.com/
* Re: SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB
2004-08-13 0:18 ` Lamont R. Peterson
@ 2004-08-13 2:12 ` Philippe Gramoullé
0 siblings, 0 replies; 8+ messages in thread
From: Philippe Gramoullé @ 2004-08-13 2:12 UTC (permalink / raw)
To: lamont; +Cc: reiserfs-list
Hello,
On Thu, 12 Aug 2004 18:18:06 -0600
"Lamont R. Peterson" <lamont@gurulabs.com> wrote:
| > Are you aware of the same kind of tool for SCSI disks? (I need one for an "about to die" FUJITSU MAM3184 18 GB 15K RPM disk.)
|
| SpinRite also works for SCSI disks. Sorry, I do not have a URL. Google
| for it.
Thanks, found it: http://www.grc.com/
There's a nice advertisement/review in Linux Journal: http://www.linuxjournal.com/article.php?sid=7684
Thanks,
Philippe
end of thread, other threads:[~2004-08-13 2:12 UTC | newest]
2004-08-12 16:22 SMART Self-Test Reports UNC Errors on Dell Latitude D800 Hitachi Travelstar DK23FB Michael G. Morey
2004-08-12 17:32 ` Bernd Schubert
2004-08-12 17:48 ` Michael G. Morey
2004-08-12 18:28 ` Bernd Schubert
2004-08-12 20:18 ` Lamont R. Peterson
2004-08-13 0:12 ` Philippe Gramoullé
2004-08-13 0:18 ` Lamont R. Peterson
2004-08-13 2:12 ` Philippe Gramoullé