* RAID 5 3-drive array failed 2 disks at once - can anything be saved?
@ 2013-09-13 14:55 Robert Schultz
2013-09-14 14:24 ` Phil Turmel
0 siblings, 1 reply; 7+ messages in thread
From: Robert Schultz @ 2013-09-13 14:55 UTC (permalink / raw)
To: linux-raid
Heeding the advice to ask questions before messing things up even worse,
here goes.
I have a PC running BackupPC.
The system contains 4 disks:
boot & system: 1x WD 20GB IDE
backup data: RAID 5 array containing 3 x Seagate 2TB SATA drives
ST32000542AS /dev/sdb
ST2000DM001 /dev/sdc
ST32000542AS /dev/sdd
Two days ago the system alerted me to a problem with the array:
A Fail event had been detected on md device /dev/md0.
It could be related to component device /dev/sdd1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[3](F) sdd1[1](F) sdb1[0]
3906763776 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/1] [U__]
unused devices: <none>
followed by:
A FailSpare event had been detected on md device /dev/md0.
It could be related to component device /dev/sdc1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[3](F) sdd1[1](F) sdb1[0]
3906763776 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/1] [U__]
unused devices: <none>
and then:
A Fail event had been detected on md device /dev/md0.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[3](F) sdd1[1](F) sdb1[0]
3906763776 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/1] [U__]
unused devices: <none>
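For readers unfamiliar with the notation: the `[3/1] [U__]` field in the mdstat output above means that 3 member devices are expected and only 1 is active. With two members missing, a RAID 5 array cannot run. Below is a minimal shell sketch of extracting those counts; the helper name `md_member_counts` is made up for illustration and is not part of mdadm.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print
# "expected active" for every line that contains an "[n/m]" field.
md_member_counts() {
    sed -n 's/.*\[\([0-9][0-9]*\)\/\([0-9][0-9]*\)\].*/\1 \2/p'
}

# Status line copied from the alert mail above.
sample='3906763776 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/1] [U__]'
counts=$(printf '%s\n' "$sample" | md_member_counts)
expected=${counts% *}   # text before the space
active=${counts#* }     # text after the space
echo "expected=$expected active=$active missing=$((expected - active))"
```

On a live system the same function could be fed with `md_member_counts < /proc/mdstat`.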
I rebooted the machine and the system dropped to busybox after throwing
a bunch of errors like:
exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
BMDMA stat 0x64
failed command: READ DMA
cmd c8/00:08:08:08:00/00:00:00:00:00/f0 tag 0 dma 4096 in
res 51/40:00:0a:08:00/00:00:00:00:00/10 Emask 0x9 (media error)
status: { DRDY ERR }
error: { UNC }
I rebooted into Seatools and ran short tests. Drive sdd failed. I ran
the long test and repaired the disk. I assume this disk is completely
gone. It's under warranty and I'll have to open an RMA, even though at
this point Seatools thinks it is in fine shape :-(
Unfortunately, for some reason the array failed sdc and Seatools shows
it as fine.
Here is the mdadm detail:
root@bkpr:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Fri May 31 11:06:39 2013
Raid Level : raid5
Used Dev Size : 1953381888 (1862.89 GiB 2000.26 GB)
Raid Devices : 3
Total Devices : 1
Persistence : Superblock is persistent
Update Time : Wed Sep 11 21:54:08 2013
State : active, FAILED, Not Started
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : bkp1:0
UUID : 77965a25:38a24b98:9ab5899c:7795ded7
Events : 308470
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 0 0 1 removed
2 0 0 2 removed
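As a sanity check on the sizes reported above: a RAID 5 array with n members offers the capacity of n-1 of them, because one member's worth of space holds parity. The sketch below redoes that arithmetic with the Used Dev Size from the `--detail` output, which is in KiB; the function name is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: usable RAID 5 capacity in KiB.
# $1 = number of member devices, $2 = used size per device in KiB.
raid5_capacity_kib() {
    members=$1
    dev_kib=$2
    echo $(( (members - 1) * dev_kib ))
}

# 3 members of 1953381888 KiB each, as reported by mdadm --detail.
raid5_capacity_kib 3 1953381888
```

The result, 3906763776, matches the block count that /proc/mdstat reports for md0.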
-----------------------------------------------------------------
Here is the mdadm examine for the three disks:
root@bkpr:~# mdadm --examine /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 77965a25:38a24b98:9ab5899c:7795ded7
Name : bkp1:0
Creation Time : Fri May 31 11:06:39 2013
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
Array Size : 3906763776 (3725.78 GiB 4000.53 GB)
Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 16788208:ea47ea51:fbbd84d9:1a2b61c7
Update Time : Wed Sep 11 21:54:08 2013
Checksum : 7d57a8ae - correct
Events : 308470
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : A.. ('A' == active, '.' == missing)
---------------------------------------------------------------------
root@bkpr:~# mdadm --examine /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 77965a25:38a24b98:9ab5899c:7795ded7
Name : bkp1:0
Creation Time : Fri May 31 11:06:39 2013
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
Array Size : 3906763776 (3725.78 GiB 4000.53 GB)
Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 1d29c79a:2a7c1bb3:130cbed5:9afce2e8
Update Time : Wed Sep 11 03:34:39 2013
Checksum : 8e8eabd9 - correct
----------------------------------------------------------------------
root@bkpr:~# mdadm --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 77965a25:38a24b98:9ab5899c:7795ded7
Name : bkp1:0
Creation Time : Fri May 31 11:06:39 2013
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
Array Size : 3906763776 (3725.78 GiB 4000.53 GB)
Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 2d4ade03:d6b7e7ce:3744b40b:21a3d17e
Update Time : Wed Sep 11 03:34:39 2013
Checksum : df56e740 - correct
Events : 308467
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing)
Events : 308467
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAA ('A' == active, '.' == missing)
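The field worth comparing across the three superblocks above is the Events counter: sdb1 is at 308470, while sdc1 and sdd1 both stopped at 308467. The sketch below extracts that counter from `mdadm --examine` style text and prints the gap. The helper name `md_events` is hypothetical. As a rule of thumb, not a guarantee, a small gap suggests the dropped members missed only a few superblock updates.

```shell
#!/bin/sh
# Hypothetical helper: print the first "Events : N" value found on stdin.
md_events() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# Excerpts copied from the --examine output above.
newest=$(printf 'Checksum : 7d57a8ae - correct\nEvents : 308470\n' | md_events)
stale=$(printf 'Checksum : df56e740 - correct\nEvents : 308467\n' | md_events)
echo "event gap: $((newest - stale))"
```

On the real machine the input would come from something like `mdadm --examine /dev/sdb1 | md_events`.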
fdisk -l shows:
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x10197396
Device Boot Start End Blocks Id System
/dev/sdb1 2048 3907029167 1953513560 fd Linux raid autodetect
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x08a89851
Device Boot Start End Blocks Id System
/dev/sdd1 2048 3907029167 1953513560 fd Linux raid autodetect
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
18 heads, 63 sectors/track, 3445352 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x5ebd3967
Device Boot Start End Blocks Id System
/dev/sdc1 2048 3907029167 1953513560 fd Linux raid autodetect
Odd (to me anyways) is that lshw shows sdc as having an ext4 filesystem. The array was using xfs.
*-disk:0
description: ATA Disk
product: ST32000542AS
vendor: Seagate
physical id: 0
bus info:scsi@2:0.1.0
logical name: /dev/sdb
version: CC34
serial: 5XW21KAF
size: 1863GiB (2TB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 signature=10197396
*-volume
description: Linux raid autodetect partition
physical id: 1
bus info:scsi@2:0.1.0,1
logical name: /dev/sdb1
capacity: 1863GiB
capabilities: primary multi
*-disk:1
description: ATA Disk
product: ST2000DM001-1CH1
vendor: Seagate
physical id: 0.0.0
bus info:scsi@3:0.0.0
logical name: /dev/sdc
version: CC24
serial: Z1E27DHL
size: 1863GiB (2TB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 signature=5ebd3967
*-volume
description: EXT4 volume
vendor: Linux
physical id: 1
bus info:scsi@3:0.0.0,1
logical name: /dev/sdc1
version: 1.0
serial: 7b6fdeb3-8632-450a-bc51-67c49ecc4ce9
size: 1863GiB
capacity: 1863GiB
capabilities: primary multi journaled extended_attributes large_files huge_files dir_nlink extents ext4 ext2 initialized
configuration: created=2013-05-17 11:56:52 filesystem=ext4 lastmountpoint=/mnt/2T modified=2013-06-15 21:52:50 mounted=2013-05-31 11:02:35 state=clean
*-disk:2
description: ATA Disk
product: ST32000542AS
vendor: Seagate
physical id: 1
bus info:scsi@3:0.1.0
logical name: /dev/sdd
version: CC34
serial: 5XW24A5V
size: 1863GiB (2TB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 signature=08a89851
*-volume
description: Linux raid autodetect partition
physical id: 1
bus info:scsi@3:0.1.0,1
logical name: /dev/sdd1
capacity: 1863GiB
capabilities: primary multi
*-serial UNCLAIMED
description: SMBus
product: N10/ICH 7 Family SMBus Controller
vendor: Intel Corporation
physical id: 1f.3
bus info:pci@0000:00:1f.3
version: 01
width: 32 bits
clock: 33MHz
configuration: latency=0
resources: ioport:400(size=32)
sdc probably did have an ext4 filesystem at one time, since it was used
to back up the RAID 1 array before converting to RAID 5.
So is there anything I can do before I attempt reassembling the
array?
Rob
* Re: RAID 5 3-drive array failed 2 disks at once - can anything be saved?
2013-09-13 14:55 RAID 5 3-drive array failed 2 disks at once - can anything be saved? Robert Schultz
@ 2013-09-14 14:24 ` Phil Turmel
2013-09-15 20:42 ` Robert Schultz
0 siblings, 1 reply; 7+ messages in thread
From: Phil Turmel @ 2013-09-14 14:24 UTC (permalink / raw)
To: Robert Schultz; +Cc: linux-raid
Good morning Robert,
On 09/13/2013 10:55 AM, Robert Schultz wrote:
> Heeding the advice to ask questions before messing things up even worse,
> here goes.
>
> I have a PC running BackupPC.
>
> The system contains 4 disks:
> boot & system: 1x WD 20GB IDE
> backup data: RAID 5 array containing 3 x Seagate 2TB SATA drives
> ST32000542AS /dev/sdb
> ST2000DM001 /dev/sdc
> ST32000542AS /dev/sdd
>
> Two days ago the system alerted me to a problem with the array:
>
> A Fail event had been detected on md device /dev/md0.
>
> It could be related to component device /dev/sdd1.
>
> Faithfully yours, etc.
You can probably save everything. From the drive models given, you are
certainly suffering from timeout mismatch on desktop drives. Such
drives are not suitable for use in raid arrays "out of the box". For
many explanations of this, please search the list archives for various
combinations of "scterc", "error recovery", "device/timeout", and/or "URE".
Please provide a bit more information:
1) Redo your "mdadm -E /dev/sdd1", as you cut off part of its output.
2) show "for x in /sys/block/*/device/timeout ; do echo $x $(< $x) ;
done" to see your driver timeouts.
3) show "for x in sdb sdc sdd ; do echo $x ; smartctl -x /dev/$x ; done"
so we can see your drive health in detail, and the scterc capability.
(Sure to be none for the ST2000DM001 -- I have a couple of those.)
If I'm correct, saving your array will be the following steps:
1) Set long driver timeouts:
for x in /sys/block/*/device/timeout ; do echo 180 > $x ; done
2) Stop the array, then force assembly:
mdadm -S /dev/md0
mdadm -A --force /dev/md0 /dev/sd[bcd]1
3) Start a "check" scrub on your array:
echo check >/sys/block/md0/md/sync_action
The kernel MD driver only allows fixing 10 read errors per hour (after
20 in the first hour) before kicking a drive out anyways. If you've
accumulated many pending errors, your check may not finish. Simply
repeat "2" & "3" to get through.
4) If "mismatch_cnt" is non-zero at the end, also run a "repair" scrub.
5) Use "fsck -y" on your filesystem to fix any remaining errors, then
mount your filesystem.
6) Make a backup while you can.
7) Add "1" to your rc.local script so it is set on every reboot.
8) Add "3" to a weekly cron job so you don't let pending disk errors
accumulate.
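Steps 3 and 4 above can be sketched as a few small shell functions. This is only an illustration under stated assumptions: the function names are made up, the array is assumed to be md0, and the sysfs directory is passed as a parameter so the logic can be tried against a scratch directory first.

```shell
#!/bin/sh
# Assumption: the array's sysfs directory; override MD_DIR for other arrays.
MD_DIR=${MD_DIR:-/sys/block/md0/md}

# Step 3: request a "check" scrub.
start_check() {
    echo check > "$1/sync_action"
}

# sync_action reads "idle" once the scrub has finished or was aborted.
wait_until_idle() {
    while [ "$(cat "$1/sync_action")" != "idle" ]; do
        sleep 60
    done
}

# Step 4: succeed (exit status 0) when mismatch_cnt is non-zero.
needs_repair() {
    [ "$(cat "$1/mismatch_cnt")" -ne 0 ]
}

# Example, as root on the real array:
#   start_check "$MD_DIR" && wait_until_idle "$MD_DIR"
#   if needs_repair "$MD_DIR"; then echo repair > "$MD_DIR/sync_action"; fi
```

If a drive is kicked out mid-check, as described in step 3, the array has to be stopped and force-assembled again before the check is restarted; this sketch does not automate that.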
HTH,
Phil
* Re: RAID 5 3-drive array failed 2 disks at once - can anything be saved?
2013-09-14 14:24 ` Phil Turmel
@ 2013-09-15 20:42 ` Robert Schultz
2013-09-16 1:12 ` Phil Turmel
0 siblings, 1 reply; 7+ messages in thread
From: Robert Schultz @ 2013-09-15 20:42 UTC (permalink / raw)
To: linux-raid; +Cc: Phil Turmel
Phil:
Thank you for the information. This is my backup machine. Up to this
point I wasn't concerned about having a second copy of this machine, but
I have a tendency to decommission a computer and leave its backups in
BackupPC for archive purposes. I probably don't really, really need
anything on this PC. That said, I am very paranoid that I will have
some other failure before I can resolve this :-(
I hadn't read anything about timing in disks in RAID - I'll have to go
do some research. I see WD has their RED series that appears to be
directed to this market.
Here is the information requested. Please let me know if this changes
anything in your instructions. I'll hold off until you confirm.
>1) Redo your "mdadm -E /dev/sdd1", as you cut off part of its output.
root@bkpr:~# mdadm -E /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 77965a25:38a24b98:9ab5899c:7795ded7
Name : bkp1:0
Creation Time : Fri May 31 11:06:39 2013
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
Array Size : 3906763776 (3725.78 GiB 4000.53 GB)
Used Dev Size : 3906763776 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 1d29c79a:2a7c1bb3:130cbed5:9afce2e8
Update Time : Wed Sep 11 03:34:39 2013
Checksum : 8e8eabd9 - correct
Events : 308467
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAA ('A' == active, '.' == missing)
>2) show "for x in /sys/block/*/device/timeout ; do echo $x $(< $x) ;
done" to see your driver timeouts.
root@bkpr:~# for x in /sys/block/*/device/timeout ; do echo $x $(< $x) ; done
/sys/block/sda/device/timeout 30
/sys/block/sdb/device/timeout 30
/sys/block/sdc/device/timeout 30
/sys/block/sdd/device/timeout 30
/sys/block/sde/device/timeout 30
/sys/block/sdf/device/timeout 30
/sys/block/sdg/device/timeout 30
/sys/block/sdh/device/timeout 30
/sys/block/sdi/device/timeout 30
Only sdb, sdc and sdd matter for the array. sda is the 20GB IDE
boot/system disk; sdf through sdi would be the USB live CD.
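The 30-second values listed above are what step 1 of Phil's instructions raises to 180. A variant of that loop limited to the three array members might look like the sketch below. The function names are hypothetical, and the sysfs root is a parameter only so the functions can be exercised against a scratch directory.

```shell
#!/bin/sh
# set_timeouts ROOT SECONDS DISK...
# Write SECONDS into ROOT/DISK/device/timeout for each named disk.
set_timeouts() {
    root=$1
    seconds=$2
    shift 2
    for disk in "$@"; do
        echo "$seconds" > "$root/$disk/device/timeout"
    done
}

# show_timeouts ROOT DISK...
# Print "DISK TIMEOUT" for each named disk.
show_timeouts() {
    root=$1
    shift
    for disk in "$@"; do
        echo "$disk $(cat "$root/$disk/device/timeout")"
    done
}

# Example, as root on the real system:
#   set_timeouts /sys/block 180 sdb sdc sdd
#   show_timeouts /sys/block sdb sdc sdd
```

The setting does not survive a reboot, which is why step 7 puts it in rc.local. Device names can change between boots, so a boot-time version should confirm which disks are which before writing.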
>3) show "for x in sdb sdc sdd ; do echo $x ; smartctl -x /dev/$x ;
done" so we can see your drive health in detail, and the scterc
capability. (Sure to be none for the ST2000DM001 -- I have a couple of
those.)
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-21-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda LP
Device Model: ST32000542AS
Serial Number: 5XW21KAF
LU WWN Device Id: 5 000c50 02ecc7806
Firmware Version: CC34
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Sep 15 14:14:44 2013 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test
routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 643) seconds.
Offline data collection
capabilities: (0x73) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-- 118 099 006 - 194290954
3 Spin_Up_Time PO---- 100 100 000 - 0
4 Start_Stop_Count -O--CK 100 100 020 - 223
5 Reallocated_Sector_Ct PO--CK 100 100 036 - 0
7 Seek_Error_Rate POSR-- 094 060 030 - 2889012633
9 Power_On_Hours -O--CK 076 076 000 - 21491
10 Spin_Retry_Count PO--C- 100 100 097 - 0
12 Power_Cycle_Count -O--CK 100 100 020 - 224
183 Runtime_Bad_Block -O--CK 100 100 000 - 0
184 End-to-End_Error -O--CK 100 100 099 - 0
187 Reported_Uncorrect -O--CK 100 100 000 - 0
188 Command_Timeout -O--CK 100 100 000 - 0
189 High_Fly_Writes -O-RCK 100 100 000 - 0
190 Airflow_Temperature_Cel -O---K 056 045 045 Past 44 (Min/Max 40/44)
194 Temperature_Celsius -O---K 044 055 000 - 44 (0 18 0 0)
195 Hardware_ECC_Recovered -O-RC- 046 021 000 - 194290954
197 Current_Pending_Sector -O--C- 100 100 000 - 0
198 Offline_Uncorrectable ----C- 100 100 000 - 0
199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0
240 Head_Flying_Hours ------ 100 253 000 - 140621524259782
241 Total_LBAs_Written ------ 100 253 000 - 1857243927
242 Total_LBAs_Read ------ 100 253 000 - 2508858022
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
ATA_READ_LOG_EXT (addr=0x00:0x00, page=0, n=1) failed: scsi error aborted command
Read GP Log Directory failed.
SMART Log Directory Version 1 [multi-sector log support]
SMART Log at address 0x00 has 1 sectors [Log Directory]
SMART Log at address 0x01 has 1 sectors [Summary SMART error log]
SMART Log at address 0x02 has 5 sectors [Comprehensive SMART error log]
SMART Log at address 0x03 has 5 sectors [Ext. Comprehensive SMART error log]
SMART Log at address 0x06 has 1 sectors [SMART self-test log]
SMART Log at address 0x07 has 1 sectors [Extended self-test log]
SMART Log at address 0x09 has 1 sectors [Selective self-test log]
SMART Log at address 0x10 has 1 sectors [NCQ Command Error]
SMART Log at address 0x11 has 1 sectors [SATA Phy Event Counters]
SMART Log at address 0x21 has 1 sectors [Write stream error log]
SMART Log at address 0x22 has 1 sectors [Read stream error log]
SMART Log at address 0x80 has 16 sectors [Host vendor specific log]
SMART Log at address 0x81 has 16 sectors [Host vendor specific log]
SMART Log at address 0x82 has 16 sectors [Host vendor specific log]
SMART Log at address 0x83 has 16 sectors [Host vendor specific log]
SMART Log at address 0x84 has 16 sectors [Host vendor specific log]
SMART Log at address 0x85 has 16 sectors [Host vendor specific log]
SMART Log at address 0x86 has 16 sectors [Host vendor specific log]
SMART Log at address 0x87 has 16 sectors [Host vendor specific log]
SMART Log at address 0x88 has 16 sectors [Host vendor specific log]
SMART Log at address 0x89 has 16 sectors [Host vendor specific log]
SMART Log at address 0x8a has 16 sectors [Host vendor specific log]
SMART Log at address 0x8b has 16 sectors [Host vendor specific log]
SMART Log at address 0x8c has 16 sectors [Host vendor specific log]
SMART Log at address 0x8d has 16 sectors [Host vendor specific log]
SMART Log at address 0x8e has 16 sectors [Host vendor specific log]
SMART Log at address 0x8f has 16 sectors [Host vendor specific log]
SMART Log at address 0x90 has 16 sectors [Host vendor specific log]
SMART Log at address 0x91 has 16 sectors [Host vendor specific log]
SMART Log at address 0x92 has 16 sectors [Host vendor specific log]
SMART Log at address 0x93 has 16 sectors [Host vendor specific log]
SMART Log at address 0x94 has 16 sectors [Host vendor specific log]
SMART Log at address 0x95 has 16 sectors [Host vendor specific log]
SMART Log at address 0x96 has 16 sectors [Host vendor specific log]
SMART Log at address 0x97 has 16 sectors [Host vendor specific log]
SMART Log at address 0x98 has 16 sectors [Host vendor specific log]
SMART Log at address 0x99 has 16 sectors [Host vendor specific log]
SMART Log at address 0x9a has 16 sectors [Host vendor specific log]
SMART Log at address 0x9b has 16 sectors [Host vendor specific log]
SMART Log at address 0x9c has 16 sectors [Host vendor specific log]
SMART Log at address 0x9d has 16 sectors [Host vendor specific log]
SMART Log at address 0x9e has 16 sectors [Host vendor specific log]
SMART Log at address 0x9f has 16 sectors [Host vendor specific log]
SMART Log at address 0xa1 has 20 sectors [Device vendor specific log]
SMART Log at address 0xa8 has 129 sectors [Device vendor specific log]
SMART Log at address 0xa9 has 1 sectors [Device vendor specific log]
SMART Log at address 0xbd has 252 sectors [Device vendor specific log]
SMART Log at address 0xc0 has 1 sectors [Device vendor specific log]
SMART Log at address 0xe0 has 1 sectors [SCT Command/Status]
SMART Log at address 0xe1 has 1 sectors [SCT Data Transfer]
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log Version: 1
No Errors Logged
SMART Extended Self-test Log (GP Log 0x07) not supported
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 21425 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 522 (0x020a)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 44 Celsius
Power Cycle Min/Max Temperature: 40/44 Celsius
Lifetime Min/Max Temperature: 18/55 Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 10 minutes
Temperature Logging Interval: 59 minutes
Min/Max recommended Temperature: 14/55 Celsius
Min/Max Temperature Limit: 10/60 Celsius
Temperature History Size (Index): 128 (19)
Index Estimated Time Temperature Celsius
20 2013-09-10 08:24 45 **************************
21 2013-09-10 09:23 44 *************************
... ..( 7 skipped). .. *************************
29 2013-09-10 17:15 44 *************************
30 2013-09-10 18:14 45 **************************
31 2013-09-10 19:13 47 ****************************
32 2013-09-10 20:12 46 ***************************
33 2013-09-10 21:11 48 *****************************
34 2013-09-10 22:10 51 ********************************
35 2013-09-10 23:09 51 ********************************
36 2013-09-11 00:08 50 *******************************
37 2013-09-11 01:07 47 ****************************
38 2013-09-11 02:06 45 **************************
39 2013-09-11 03:05 49 ******************************
40 2013-09-11 04:04 46 ***************************
41 2013-09-11 05:03 45 **************************
42 2013-09-11 06:02 45 **************************
43 2013-09-11 07:01 49 ******************************
44 2013-09-11 08:00 50 *******************************
45 2013-09-11 08:59 50 *******************************
46 2013-09-11 09:58 49 ******************************
47 2013-09-11 10:57 46 ***************************
48 2013-09-11 11:56 45 **************************
... ..( 5 skipped). .. **************************
54 2013-09-11 17:50 45 **************************
55 2013-09-11 18:49 47 ****************************
56 2013-09-11 19:48 47 ****************************
57 2013-09-11 20:47 47 ****************************
58 2013-09-11 21:46 50 *******************************
59 2013-09-11 22:45 51 ********************************
60 2013-09-11 23:44 50 *******************************
61 2013-09-12 00:43 50 *******************************
62 2013-09-12 01:42 46 ***************************
63 2013-09-12 02:41 49 ******************************
64 2013-09-12 03:40 48 *****************************
65 2013-09-12 04:39 45 **************************
... ..( 12 skipped). .. **************************
78 2013-09-12 17:26 45 **************************
79 2013-09-12 18:25 47 ****************************
80 2013-09-12 19:24 50 *******************************
81 2013-09-12 20:23 48 *****************************
82 2013-09-12 21:22 50 *******************************
83 2013-09-12 22:21 51 ********************************
84 2013-09-12 23:20 50 *******************************
85 2013-09-13 00:19 48 *****************************
86 2013-09-13 01:18 46 ***************************
87 2013-09-13 02:17 49 ******************************
88 2013-09-13 03:16 47 ****************************
89 2013-09-13 04:15 45 **************************
... ..( 7 skipped). .. **************************
97 2013-09-13 12:07 45 **************************
98 2013-09-13 13:06 46 ***************************
99 2013-09-13 14:05 45 **************************
100 2013-09-13 15:04 45 **************************
101 2013-09-13 16:03 46 ***************************
102 2013-09-13 17:02 49 ******************************
103 2013-09-13 18:01 51 ********************************
104 2013-09-13 19:00 52 *********************************
105 2013-09-13 19:59 48 *****************************
106 2013-09-13 20:58 51 ********************************
107 2013-09-13 21:57 51 ********************************
108 2013-09-13 22:56 49 ******************************
109 2013-09-13 23:55 46 ***************************
110 2013-09-14 00:54 46 ***************************
111 2013-09-14 01:53 46 ***************************
112 2013-09-14 02:52 45 **************************
... ..( 5 skipped). .. **************************
118 2013-09-14 08:46 45 **************************
119 2013-09-14 09:45 46 ***************************
... ..( 4 skipped). .. ***************************
124 2013-09-14 14:40 46 ***************************
125 2013-09-14 15:39 45 **************************
126 2013-09-14 16:38 ? -
127 2013-09-14 17:37 45 **************************
0 2013-09-14 18:36 ? -
1 2013-09-14 19:35 44 *************************
2 2013-09-14 20:34 ? -
3 2013-09-14 21:33 22 ***
4 2013-09-14 22:32 ? -
5 2013-09-14 23:31 27 ********
6 2013-09-15 00:30 ? -
7 2013-09-15 01:29 27 ********
8 2013-09-15 02:28 ? -
9 2013-09-15 03:27 40 *********************
10 2013-09-15 04:26 ? -
11 2013-09-15 05:25 40 *********************
12 2013-09-15 06:24 ? -
13 2013-09-15 07:23 40 *********************
14 2013-09-15 08:22 ? -
15 2013-09-15 09:21 40 *********************
16 2013-09-15 10:20 40 *********************
17 2013-09-15 11:19 44 *************************
18 2013-09-15 12:18 44 *************************
19 2013-09-15 13:17 44 *************************
SCT Error Recovery Control:
Read: Disabled
Write: Disabled
SATA Phy Event Counters (GP Log 0x11) not supported
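The output above shows that this ST32000542AS supports SCT Error Recovery Control but has it disabled, while the capability list of the ST2000DM001 further down mentions only SCT Status. The sketch below makes that per-drive decision from smartctl text. The helper names are made up, and the value 70 (7.0 seconds, in units of 100 ms) in the commented example is a commonly used setting assumed here, not something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed if smartctl output on stdin advertises SCT ERC.
has_scterc() {
    grep -q 'SCT Error Recovery Control supported'
}

# report NAME CAPABILITY_TEXT: print which remedy applies to the drive.
report() {
    if printf '%s\n' "$2" | has_scterc; then
        echo "$1: SCT ERC available"
    else
        echo "$1: no SCT ERC, rely on a long driver timeout"
    fi
}

# Capability lines copied from the smartctl output in this mail.
sdb_caps='SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.'
sdc_caps='SCT Status supported.'

report sdb "$sdb_caps"
report sdc "$sdc_caps"

# Example, as root on the real system:
#   if smartctl -x /dev/sdb | has_scterc; then
#       smartctl -l scterc,70,70 /dev/sdb
#   fi
```

On many drives the scterc setting is lost at power-off, so it typically needs to be reapplied at boot alongside the driver timeouts.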
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-21-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: ST2000DM001-1CH164
Serial Number: Z1E27DHL
LU WWN Device Id: 5 000c50 04f1ae32f
Firmware Version: CC24
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Sep 15 14:14:44 2013 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test
routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 584) seconds.
Offline data collection
capabilities: (0x73) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 221) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x3085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-- 117 099 006 - 125736168
3 Spin_Up_Time PO---- 096 096 000 - 0
4 Start_Stop_Count -O--CK 100 100 020 - 14
5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0
7 Seek_Error_Rate POSR-- 076 060 030 - 13007233819
9 Power_On_Hours -O--CK 098 098 000 - 2592
10 Spin_Retry_Count PO--C- 100 100 097 - 0
12 Power_Cycle_Count -O--CK 100 100 020 - 14
183 Runtime_Bad_Block -O--CK 099 099 000 - 1
184 End-to-End_Error -O--CK 100 100 099 - 0
187 Reported_Uncorrect -O--CK 100 100 000 - 0
188 Command_Timeout -O--CK 100 100 000 - 0
189 High_Fly_Writes -O-RCK 097 097 000 - 3
190 Airflow_Temperature_Cel -O---K 053 040 045 Past 47 (16 76 47 41)
191 G-Sense_Error_Rate -O--CK 100 100 000 - 0
192 Power-Off_Retract_Count -O--CK 100 100 000 - 14
193 Load_Cycle_Count -O--CK 100 100 000 - 49
194 Temperature_Celsius -O---K 047 060 000 - 47 (0 20 0 0)
197 Current_Pending_Sector -O--C- 100 100 000 - 0
198 Offline_Uncorrectable ----C- 100 100 000 - 0
199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0
240 Head_Flying_Hours ------ 100 253 000 - 108306190305758
241 Total_LBAs_Written ------ 100 253 000 - 10738051480
242 Total_LBAs_Read ------ 100 253 000 - 22593654149
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
ATA_READ_LOG_EXT (addr=0x00:0x00, page=0, n=1) failed: scsi error aborted command
Read GP Log Directory failed.
SMART Log Directory Version 1 [multi-sector log support]
SMART Log at address 0x00 has 1 sectors [Log Directory]
SMART Log at address 0x01 has 1 sectors [Summary SMART error log]
SMART Log at address 0x02 has 5 sectors [Comprehensive SMART error log]
SMART Log at address 0x06 has 1 sectors [SMART self-test log]
SMART Log at address 0x09 has 1 sectors [Selective self-test log]
SMART Log at address 0x80 has 16 sectors [Host vendor specific log]
SMART Log at address 0x81 has 16 sectors [Host vendor specific log]
SMART Log at address 0x82 has 16 sectors [Host vendor specific log]
SMART Log at address 0x83 has 16 sectors [Host vendor specific log]
SMART Log at address 0x84 has 16 sectors [Host vendor specific log]
SMART Log at address 0x85 has 16 sectors [Host vendor specific log]
SMART Log at address 0x86 has 16 sectors [Host vendor specific log]
SMART Log at address 0x87 has 16 sectors [Host vendor specific log]
SMART Log at address 0x88 has 16 sectors [Host vendor specific log]
SMART Log at address 0x89 has 16 sectors [Host vendor specific log]
SMART Log at address 0x8a has 16 sectors [Host vendor specific log]
SMART Log at address 0x8b has 16 sectors [Host vendor specific log]
SMART Log at address 0x8c has 16 sectors [Host vendor specific log]
SMART Log at address 0x8d has 16 sectors [Host vendor specific log]
SMART Log at address 0x8e has 16 sectors [Host vendor specific log]
SMART Log at address 0x8f has 16 sectors [Host vendor specific log]
SMART Log at address 0x90 has 16 sectors [Host vendor specific log]
SMART Log at address 0x91 has 16 sectors [Host vendor specific log]
SMART Log at address 0x92 has 16 sectors [Host vendor specific log]
SMART Log at address 0x93 has 16 sectors [Host vendor specific log]
SMART Log at address 0x94 has 16 sectors [Host vendor specific log]
SMART Log at address 0x95 has 16 sectors [Host vendor specific log]
SMART Log at address 0x96 has 16 sectors [Host vendor specific log]
SMART Log at address 0x97 has 16 sectors [Host vendor specific log]
SMART Log at address 0x98 has 16 sectors [Host vendor specific log]
SMART Log at address 0x99 has 16 sectors [Host vendor specific log]
SMART Log at address 0x9a has 16 sectors [Host vendor specific log]
SMART Log at address 0x9b has 16 sectors [Host vendor specific log]
SMART Log at address 0x9c has 16 sectors [Host vendor specific log]
SMART Log at address 0x9d has 16 sectors [Host vendor specific log]
SMART Log at address 0x9e has 16 sectors [Host vendor specific log]
SMART Log at address 0x9f has 16 sectors [Host vendor specific log]
SMART Log at address 0xa1 has 20 sectors [Device vendor specific log]
SMART Log at address 0xa8 has 129 sectors [Device vendor specific log]
SMART Log at address 0xa9 has 1 sectors [Device vendor specific log]
SMART Log at address 0xc0 has 1 sectors [Device vendor specific log]
SMART Log at address 0xc1 has 10 sectors [Device vendor specific log]
SMART Log at address 0xc4 has 5 sectors [Device vendor specific log]
SMART Log at address 0xe0 has 1 sectors [SCT Command/Status]
SMART Log at address 0xe1 has 1 sectors [SCT Data Transfer]
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log Version: 1
No Errors Logged
SMART Extended Self-test Log (GP Log 0x07) not supported
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 2526 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Warning: device does not support SCT Data Table command
Warning: device does not support SCT Error Recovery Control command
SATA Phy Event Counters (GP Log 0x11) not supported
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-21-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda LP
Device Model: ST32000542AS
Serial Number: 5XW24A5V
LU WWN Device Id: 5 000c50 02ece5bf0
Firmware Version: CC34
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Sep 15 14:14:44 2013 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test
routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 633) seconds.
Offline data collection
capabilities: (0x73) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-- 095 095 006 - 164677781
3 Spin_Up_Time PO---- 100 100 000 - 0
4 Start_Stop_Count -O--CK 100 100 020 - 223
5 Reallocated_Sector_Ct PO--CK 100 100 036 - 0
7 Seek_Error_Rate POSR-- 094 060 030 - 2933855450
9 Power_On_Hours -O--CK 075 075 000 - 22657
10 Spin_Retry_Count PO--C- 100 100 097 - 0
12 Power_Cycle_Count -O--CK 100 100 020 - 224
183 Runtime_Bad_Block -O--CK 100 100 000 - 0
184 End-to-End_Error -O--CK 100 100 099 - 0
187 Reported_Uncorrect -O--CK 001 001 000 - 211
188 Command_Timeout -O--CK 100 099 000 - 4295032835
189 High_Fly_Writes -O-RCK 100 100 000 - 0
190 Airflow_Temperature_Cel -O---K 053 042 045 Past 47 (17 206 47 42)
194 Temperature_Celsius -O---K 047 058 000 - 47 (0 18 0 0)
195 Hardware_ECC_Recovered -O-RC- 041 024 000 - 164677781
197 Current_Pending_Sector -O--C- 098 095 000 - 91
198 Offline_Uncorrectable ----C- 098 095 000 - 91
199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0
240 Head_Flying_Hours ------ 100 253 000 - 163088498186598
241 Total_LBAs_Written ------ 100 253 000 - 2802097688
242 Total_LBAs_Read ------ 100 253 000 - 189741710
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
ATA_READ_LOG_EXT (addr=0x00:0x00, page=0, n=1) failed: scsi error
aborted command
Read GP Log Directory failed.
SMART Log Directory Version 1 [multi-sector log support]
SMART Log at address 0x00 has 1 sectors [Log Directory]
SMART Log at address 0x01 has 1 sectors [Summary SMART error log]
SMART Log at address 0x02 has 5 sectors [Comprehensive SMART error log]
SMART Log at address 0x03 has 5 sectors [Ext. Comprehensive SMART error log]
SMART Log at address 0x06 has 1 sectors [SMART self-test log]
SMART Log at address 0x07 has 1 sectors [Extended self-test log]
SMART Log at address 0x09 has 1 sectors [Selective self-test log]
SMART Log at address 0x10 has 1 sectors [NCQ Command Error]
SMART Log at address 0x11 has 1 sectors [SATA Phy Event Counters]
SMART Log at address 0x21 has 1 sectors [Write stream error log]
SMART Log at address 0x22 has 1 sectors [Read stream error log]
SMART Log at address 0x80 has 16 sectors [Host vendor specific log]
SMART Log at address 0x81 has 16 sectors [Host vendor specific log]
SMART Log at address 0x82 has 16 sectors [Host vendor specific log]
SMART Log at address 0x83 has 16 sectors [Host vendor specific log]
SMART Log at address 0x84 has 16 sectors [Host vendor specific log]
SMART Log at address 0x85 has 16 sectors [Host vendor specific log]
SMART Log at address 0x86 has 16 sectors [Host vendor specific log]
SMART Log at address 0x87 has 16 sectors [Host vendor specific log]
SMART Log at address 0x88 has 16 sectors [Host vendor specific log]
SMART Log at address 0x89 has 16 sectors [Host vendor specific log]
SMART Log at address 0x8a has 16 sectors [Host vendor specific log]
SMART Log at address 0x8b has 16 sectors [Host vendor specific log]
SMART Log at address 0x8c has 16 sectors [Host vendor specific log]
SMART Log at address 0x8d has 16 sectors [Host vendor specific log]
SMART Log at address 0x8e has 16 sectors [Host vendor specific log]
SMART Log at address 0x8f has 16 sectors [Host vendor specific log]
SMART Log at address 0x90 has 16 sectors [Host vendor specific log]
SMART Log at address 0x91 has 16 sectors [Host vendor specific log]
SMART Log at address 0x92 has 16 sectors [Host vendor specific log]
SMART Log at address 0x93 has 16 sectors [Host vendor specific log]
SMART Log at address 0x94 has 16 sectors [Host vendor specific log]
SMART Log at address 0x95 has 16 sectors [Host vendor specific log]
SMART Log at address 0x96 has 16 sectors [Host vendor specific log]
SMART Log at address 0x97 has 16 sectors [Host vendor specific log]
SMART Log at address 0x98 has 16 sectors [Host vendor specific log]
SMART Log at address 0x99 has 16 sectors [Host vendor specific log]
SMART Log at address 0x9a has 16 sectors [Host vendor specific log]
SMART Log at address 0x9b has 16 sectors [Host vendor specific log]
SMART Log at address 0x9c has 16 sectors [Host vendor specific log]
SMART Log at address 0x9d has 16 sectors [Host vendor specific log]
SMART Log at address 0x9e has 16 sectors [Host vendor specific log]
SMART Log at address 0x9f has 16 sectors [Host vendor specific log]
SMART Log at address 0xa1 has 20 sectors [Device vendor specific log]
SMART Log at address 0xa8 has 129 sectors [Device vendor specific log]
SMART Log at address 0xa9 has 1 sectors [Device vendor specific log]
SMART Log at address 0xbd has 252 sectors [Device vendor specific log]
SMART Log at address 0xc0 has 1 sectors [Device vendor specific log]
SMART Log at address 0xe0 has 1 sectors [SCT Command/Status]
SMART Log at address 0xe1 has 1 sectors [SCT Data Transfer]
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log Version: 1
ATA Error Count: 204 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 204 occurred at disk power-on lifetime: 22591 hours (941 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 c0 08 00 00 Error: UNC at LBA = 0x000008c0 = 2240
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 0b 00 c0 08 00 e0 00 00:42:42.402 READ VERIFY SECTOR(S) EXT
42 0b 00 bd 08 00 e0 00 00:42:38.690 READ VERIFY SECTOR(S) EXT
42 0b 00 ba 08 00 e0 00 00:42:35.017 READ VERIFY SECTOR(S) EXT
42 0b 00 b7 08 00 e0 00 00:42:31.314 READ VERIFY SECTOR(S) EXT
42 0b 00 b5 08 00 e0 00 00:42:27.631 READ VERIFY SECTOR(S) EXT
Error 203 occurred at disk power-on lifetime: 22591 hours (941 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 bf 08 00 00 Error: UNC at LBA = 0x000008bf = 2239
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 0b 00 bd 08 00 e0 00 00:42:38.690 READ VERIFY SECTOR(S) EXT
42 0b 00 ba 08 00 e0 00 00:42:35.017 READ VERIFY SECTOR(S) EXT
42 0b 00 b7 08 00 e0 00 00:42:31.314 READ VERIFY SECTOR(S) EXT
42 0b 00 b5 08 00 e0 00 00:42:27.631 READ VERIFY SECTOR(S) EXT
42 0b 00 b4 08 00 e0 00 00:42:23.908 READ VERIFY SECTOR(S) EXT
Error 202 occurred at disk power-on lifetime: 22591 hours (941 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 bc 08 00 00 Error: UNC at LBA = 0x000008bc = 2236
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 0b 00 ba 08 00 e0 00 00:42:35.017 READ VERIFY SECTOR(S) EXT
42 0b 00 b7 08 00 e0 00 00:42:31.314 READ VERIFY SECTOR(S) EXT
42 0b 00 b5 08 00 e0 00 00:42:27.631 READ VERIFY SECTOR(S) EXT
42 0b 00 b4 08 00 e0 00 00:42:23.908 READ VERIFY SECTOR(S) EXT
42 0b 00 b0 08 00 e0 00 00:42:20.225 READ VERIFY SECTOR(S) EXT
Error 201 occurred at disk power-on lifetime: 22591 hours (941 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 b9 08 00 00 Error: UNC at LBA = 0x000008b9 = 2233
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 0b 00 b7 08 00 e0 00 00:42:31.314 READ VERIFY SECTOR(S) EXT
42 0b 00 b5 08 00 e0 00 00:42:27.631 READ VERIFY SECTOR(S) EXT
42 0b 00 b4 08 00 e0 00 00:42:23.908 READ VERIFY SECTOR(S) EXT
42 0b 00 b0 08 00 e0 00 00:42:20.225 READ VERIFY SECTOR(S) EXT
42 0b 00 af 08 00 e0 00 00:42:16.513 READ VERIFY SECTOR(S) EXT
Error 200 occurred at disk power-on lifetime: 22591 hours (941 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 b6 08 00 00 Error: UNC at LBA = 0x000008b6 = 2230
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 0b 00 b5 08 00 e0 00 00:42:27.631 READ VERIFY SECTOR(S) EXT
42 0b 00 b4 08 00 e0 00 00:42:23.908 READ VERIFY SECTOR(S) EXT
42 0b 00 b0 08 00 e0 00 00:42:20.225 READ VERIFY SECTOR(S) EXT
42 0b 00 af 08 00 e0 00 00:42:16.513 READ VERIFY SECTOR(S) EXT
42 0b 00 ae 08 00 e0 00 00:42:12.850 READ VERIFY SECTOR(S) EXT
SMART Extended Self-test Log (GP Log 0x07) not supported
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 22591 2241
# 2 Short offline Completed: read failure 90% 22591 2058
# 3 Short offline Completed: read failure 90% 22591 2058
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 522 (0x020a)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 47 Celsius
Power Cycle Min/Max Temperature: 42/47 Celsius
Lifetime Min/Max Temperature: 18/58 Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 10 minutes
Temperature Logging Interval: 59 minutes
Min/Max recommended Temperature: 14/55 Celsius
Min/Max Temperature Limit: 10/60 Celsius
Temperature History Size (Index): 128 (20)
Index Estimated Time Temperature Celsius
21 2013-09-10 08:24 49 ******************************
22 2013-09-10 09:23 48 *****************************
... ..( 7 skipped). .. *****************************
30 2013-09-10 17:15 48 *****************************
31 2013-09-10 18:14 49 ******************************
32 2013-09-10 19:13 50 *******************************
33 2013-09-10 20:12 51 ********************************
34 2013-09-10 21:11 50 *******************************
35 2013-09-10 22:10 55 ************************************
36 2013-09-10 23:09 55 ************************************
37 2013-09-11 00:08 54 ***********************************
38 2013-09-11 01:07 52 *********************************
39 2013-09-11 02:06 50 *******************************
40 2013-09-11 03:05 53 **********************************
41 2013-09-11 04:04 51 ********************************
42 2013-09-11 05:03 50 *******************************
43 2013-09-11 06:02 49 ******************************
44 2013-09-11 07:01 54 ***********************************
45 2013-09-11 08:00 55 ************************************
46 2013-09-11 08:59 55 ************************************
47 2013-09-11 09:58 53 **********************************
48 2013-09-11 10:57 50 *******************************
49 2013-09-11 11:56 49 ******************************
... ..( 5 skipped). .. ******************************
55 2013-09-11 17:50 49 ******************************
56 2013-09-11 18:49 50 *******************************
57 2013-09-11 19:48 52 *********************************
58 2013-09-11 20:47 51 ********************************
59 2013-09-11 21:46 55 ************************************
60 2013-09-11 22:45 56 *************************************
61 2013-09-11 23:44 55 ************************************
62 2013-09-12 00:43 55 ************************************
63 2013-09-12 01:42 52 *********************************
64 2013-09-12 02:41 52 *********************************
65 2013-09-12 03:40 54 ***********************************
66 2013-09-12 04:39 50 *******************************
67 2013-09-12 05:38 49 ******************************
... ..( 11 skipped). .. ******************************
79 2013-09-12 17:26 49 ******************************
80 2013-09-12 18:25 50 *******************************
81 2013-09-12 19:24 55 ************************************
82 2013-09-12 20:23 52 *********************************
83 2013-09-12 21:22 55 ************************************
84 2013-09-12 22:21 56 *************************************
85 2013-09-12 23:20 55 ************************************
86 2013-09-13 00:19 54 ***********************************
87 2013-09-13 01:18 51 ********************************
88 2013-09-13 02:17 52 *********************************
89 2013-09-13 03:16 53 **********************************
90 2013-09-13 04:15 50 *******************************
91 2013-09-13 05:14 49 ******************************
... ..( 3 skipped). .. ******************************
95 2013-09-13 09:10 49 ******************************
96 2013-09-13 10:09 50 *******************************
97 2013-09-13 11:08 50 *******************************
98 2013-09-13 12:07 49 ******************************
... ..( 3 skipped). .. ******************************
102 2013-09-13 16:03 49 ******************************
103 2013-09-13 17:02 53 **********************************
104 2013-09-13 18:01 56 *************************************
105 2013-09-13 19:00 57 **************************************
106 2013-09-13 19:59 53 **********************************
107 2013-09-13 20:58 56 *************************************
108 2013-09-13 21:57 56 *************************************
109 2013-09-13 22:56 56 *************************************
110 2013-09-13 23:55 51 ********************************
111 2013-09-14 00:54 51 ********************************
112 2013-09-14 01:53 50 *******************************
113 2013-09-14 02:52 50 *******************************
114 2013-09-14 03:51 49 ******************************
... ..( 4 skipped). .. ******************************
119 2013-09-14 08:46 49 ******************************
120 2013-09-14 09:45 50 *******************************
... ..( 5 skipped). .. *******************************
126 2013-09-14 15:39 50 *******************************
127 2013-09-14 16:38 ? -
0 2013-09-14 17:37 49 ******************************
1 2013-09-14 18:36 ? -
2 2013-09-14 19:35 48 *****************************
3 2013-09-14 20:34 ? -
4 2013-09-14 21:33 22 ***
5 2013-09-14 22:32 ? -
6 2013-09-14 23:31 27 ********
7 2013-09-15 00:30 ? -
8 2013-09-15 01:29 28 *********
9 2013-09-15 02:28 ? -
10 2013-09-15 03:27 42 ***********************
11 2013-09-15 04:26 ? -
12 2013-09-15 05:25 42 ***********************
13 2013-09-15 06:24 ? -
14 2013-09-15 07:23 42 ***********************
15 2013-09-15 08:22 ? -
16 2013-09-15 09:21 42 ***********************
17 2013-09-15 10:20 42 ***********************
18 2013-09-15 11:19 45 **************************
19 2013-09-15 12:18 47 ****************************
20 2013-09-15 13:17 47 ****************************
SCT Error Recovery Control:
Read: Disabled
Write: Disabled
SATA Phy Event Counters (GP Log 0x11) not supported
HTH, Phil
* Re: RAID 5 3-drive array failed 2 disks at once - can anything be saved?
2013-09-15 20:42 ` Robert Schultz
@ 2013-09-16 1:12 ` Phil Turmel
2013-09-19 2:29 ` Robert Schultz
0 siblings, 1 reply; 7+ messages in thread
From: Phil Turmel @ 2013-09-16 1:12 UTC (permalink / raw)
To: Robert Schultz; +Cc: linux-raid
Hi Robert,
On 09/15/2013 04:42 PM, Robert Schultz wrote:
> Phil:
>
> Thank you for the information. This is my backup machine. Up to this
> point I wasn't concerned about having a second copy of this machine, but
> I have a tendency to decommission a computer and leave the backups on by
> backuppc for archive purposes. I probably don't really, really need
> anything on this PC. That said, I am very paranoid that I will have
> some other failure before I can resolve this :-(
>
> I hadn't read anything about timing in disks in RAID - I'll have to go
> do some research. I see WD has their RED series that appears to be
> directed to this market.
Please do read the archives on the topic. You won't regret it.
And yes, the WD REDs power up with SCTERC set properly. I bought four
of these for my new media server.
> Here is the information requested. Please let me know if this changes
> anything in your instructions. I'll hold off until you confirm.
One modest change. Two of your drives *do* support SCTERC, they just
have to have it enabled on every powerup:
> SCT Error Recovery Control:
> Read: Disabled
> Write: Disabled
For those two drives, your boot sequence should have:
smartctl -l scterc,70,70 /dev/sdX
For the other, you still need:
echo 180 >/sys/block/sdX/device/timeout
Otherwise, my recommendations stand.
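
The two per-drive settings above could be folded into one boot-time helper. What follows is only a minimal sketch under assumptions: the function names, the DRY_RUN switch, and the "unknown" fallback are illustrative and not from this thread, and sdX stays a placeholder. The strings it matches are the ones smartctl printed in the reports quoted in this thread. With DRY_RUN left at 1 it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: pick a timeout strategy per drive from smartctl output.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command, or execute it when DRY_RUN is not 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Read `smartctl -l scterc` (or -x) output on stdin and print yes, no,
# or unknown depending on whether the drive accepts SCT ERC.
erc_support() {
    out="$(cat)"
    case "$out" in
        *"does not support SCT Error Recovery Control"*) echo no ;;
        *"SCT Error Recovery Control:"*) echo yes ;;
        *) echo unknown ;;
    esac
}

# Apply the matching setting to one drive.
# $1 = device name without /dev/ (placeholder: sdX), $2 = yes|no|unknown
set_timeouts() {
    dev="$1"
    support="$2"
    if [ "$support" = "yes" ]; then
        # 70 units of 100 ms = 7.0 s for both read and write recovery
        run smartctl -l scterc,70,70 "/dev/$dev"
    else
        # No (or unknown) ERC support: raise the kernel command timeout
        # to 180 s instead. This choice for "unknown" is an assumption.
        run sh -c "echo 180 > /sys/block/$dev/device/timeout"
    fi
}
```

A boot script might then call something like `set_timeouts sdX "$(smartctl -l scterc /dev/sdX | erc_support)"` once per array member, with DRY_RUN=0 and root privileges. Neither setting survives a power cycle, which is why it belongs in the boot sequence.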
Phil
* Re: RAID 5 3-drive array failed 2 disks at once - can anything be saved?
2013-09-16 1:12 ` Phil Turmel
@ 2013-09-19 2:29 ` Robert Schultz
2013-09-19 5:35 ` Phil Turmel
0 siblings, 1 reply; 7+ messages in thread
From: Robert Schultz @ 2013-09-19 2:29 UTC (permalink / raw)
To: linux-raid; +Cc: Phil Turmel
That all worked beautifully. Right up until I left the BackupPC running
against a RAID array with a bad disk.
It failed again after about 30 hours. The symptoms are the same.
I think I need to bring the array back up but leave that disk offline with:
mdadm --assemble --force /dev/md0 /dev/sdb /dev/sdc
(sdd is the bad drive)
Then follow the remainder of the steps to check.
I have a new disk on the way. I would then add this new disk into the
array and sync.
Does that sound correct?
Rob
* Re: RAID 5 3-drive array failed 2 disks at once - can anything be saved?
2013-09-19 2:29 ` Robert Schultz
@ 2013-09-19 5:35 ` Phil Turmel
2013-09-19 17:38 ` Robert Schultz
0 siblings, 1 reply; 7+ messages in thread
From: Phil Turmel @ 2013-09-19 5:35 UTC (permalink / raw)
To: Robert Schultz; +Cc: linux-raid
On 09/18/2013 10:29 PM, Robert Schultz wrote:
> That all worked beautifully. Right up until I left the BackupPC running
> against a RAID array with a bad disk.
>
> It failed again after about 30 hours. The symptoms are the same.
>
Ah, well. "Smart" can't catch everything. Do consider that you might
have some other hardware problem, though. Power supply, data cable, etc.
> I think I need to bring the array back up but leave that disk offline with:
>
> mdadm --assemble --force /dev/md0 /dev/sdb /dev/sdc
>
> (sdd is the bad drive)
>
> Then follow the remainder of the steps to check.
Don't bother with another check run. It is only meaningful if you have
all three drives.
> I have a new disk on the way. I would then add this new disk into the
> array and sync.
Yes.
> Does that sound correct?
Yes.
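
The plan confirmed above (assemble degraded on the two good members, then add the replacement and let it sync) can be sketched as a dry run. This is a sketch under assumptions, not a tested procedure: the function names and the DRY_RUN switch are illustrative, the member names sdb1 and sdc1 in the usage note come from the /proc/mdstat output at the start of the thread (the array was built on partitions, not whole disks), the initial --stop assumes the failed array is still listed as active, and /dev/sdX1 is a placeholder for a partition created on the new disk.

```shell
#!/bin/sh
# Sketch only: bring the array up degraded, then add a replacement member.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command, or execute it when DRY_RUN is not 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = array device, remaining arguments = surviving member devices
assemble_degraded() {
    md="$1"
    shift
    # Assumption: the failed array is still active and must be stopped first
    run mdadm --stop "$md"
    run mdadm --assemble --force "$md" "$@"
}

# $1 = array device, $2 = partition on the replacement disk (placeholder)
add_replacement() {
    run mdadm --manage "$1" --add "$2"
    # Rebuild progress appears in /proc/mdstat
    run cat /proc/mdstat
}
```

Calling `assemble_degraded /dev/md0 /dev/sdb1 /dev/sdc1` followed by `add_replacement /dev/md0 /dev/sdX1` prints the four commands in order. Review them, then rerun with DRY_RUN=0 as root once the new disk is installed and partitioned.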
Phil
* Re: RAID 5 3-drive array failed 2 disks at once - can anything be saved?
2013-09-19 5:35 ` Phil Turmel
@ 2013-09-19 17:38 ` Robert Schultz
0 siblings, 0 replies; 7+ messages in thread
From: Robert Schultz @ 2013-09-19 17:38 UTC (permalink / raw)
To: linux-raid; +Cc: Phil Turmel
Seatools tells me the drive is bad. It found bad sectors and 'repaired'
them, then the disk passed the test. After the second failure Seatools
found more bad sectors. I have to assume it's the disk.
It is less than 3 months old. However, I didn't register it, so the
default warranty ends in Nov. Good thing I keep receipts.
Thanks for your help.
Rob