raid1 issue after disk failure: both disks of the array are still active

* raid1 issue after disk failure: both disks of the array are still active
From: Niccolò Belli @ 2012-09-13 10:01 UTC
  To: linux-raid

Hi,
I have a raid1 array with two disks, distro is Squeeze amd64. /dev/sda 
is slowly dying, here is a snippet of "smartctl -a /dev/sda":

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1

The bad sector is in the second half-MB of the disk, in fact with "dd 
if=/dev/sda1 of=/dev/null bs=524228 count=1 skip=1" I get this output in 
/var/log/syslog:

root@asterisk:~# dd if=/dev/sda1 of=/dev/null bs=524228 count=1 skip=1
0+1 records in
0+1 records out
430140 bytes (430 kB) copied, 11.7265 s, 36.7 kB/s

Sep 12 22:15:02 asterisk kernel: [ 8921.561978] dd: sending ioctl 80306d02 to a partition!
Sep 12 22:15:02 asterisk kernel: [ 8921.561986] dd: sending ioctl 80306d02 to a partition!
Sep 12 22:15:03 asterisk kernel: [ 8922.529099] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 12 22:15:03 asterisk kernel: [ 8922.531774] ata3.00: BMDMA stat 0x44
Sep 12 22:15:03 asterisk kernel: [ 8922.533547] ata3.00: failed command: READ DMA
Sep 12 22:15:03 asterisk kernel: [ 8922.535313] ata3.00: cmd c8/00:08:48:0f:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Sep 12 22:15:03 asterisk kernel: [ 8922.535316]          res 51/40:00:48:0f:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 12 22:15:03 asterisk kernel: [ 8922.538891] ata3.00: status: { DRDY ERR }
Sep 12 22:15:03 asterisk kernel: [ 8922.540675] ata3.00: error: { UNC }
Sep 12 22:15:04 asterisk kernel: [ 8923.508206] ata3.00: configured for UDMA/133
Sep 12 22:15:04 asterisk kernel: [ 8923.508220] ata3: EH complete
Sep 12 22:15:05 asterisk kernel: [ 8924.469512] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 12 22:15:05 asterisk kernel: [ 8924.472323] ata3.00: BMDMA stat 0x44
Sep 12 22:15:05 asterisk kernel: [ 8924.475260] ata3.00: failed command: READ DMA
Sep 12 22:15:05 asterisk kernel: [ 8924.477023] ata3.00: cmd c8/00:08:48:0f:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Sep 12 22:15:05 asterisk kernel: [ 8924.477025]          res 51/40:00:48:0f:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 12 22:15:05 asterisk kernel: [ 8924.480595] ata3.00: status: { DRDY ERR }
Sep 12 22:15:05 asterisk kernel: [ 8924.482370] ata3.00: error: { UNC }
Sep 12 22:15:06 asterisk kernel: [ 8925.452209] ata3.00: configured for UDMA/133
Sep 12 22:15:06 asterisk kernel: [ 8925.452224] ata3: EH complete
Sep 12 22:15:07 asterisk kernel: [ 8926.418504] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 12 22:15:07 asterisk kernel: [ 8926.420741] ata3.00: BMDMA stat 0x44
Sep 12 22:15:07 asterisk kernel: [ 8926.422486] ata3.00: failed command: READ DMA
Sep 12 22:15:07 asterisk kernel: [ 8926.424279] ata3.00: cmd c8/00:08:48:0f:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Sep 12 22:15:07 asterisk kernel: [ 8926.424281]          res 51/40:00:48:0f:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 12 22:15:07 asterisk kernel: [ 8926.427861] ata3.00: status: { DRDY ERR }
Sep 12 22:15:07 asterisk kernel: [ 8926.429660] ata3.00: error: { UNC }
Sep 12 22:15:08 asterisk kernel: [ 8927.396270] ata3.00: configured for UDMA/133
Sep 12 22:15:08 asterisk kernel: [ 8927.396285] ata3: EH complete
Sep 12 22:15:09 asterisk kernel: [ 8928.359173] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 12 22:15:09 asterisk kernel: [ 8928.361647] ata3.00: BMDMA stat 0x44
Sep 12 22:15:09 asterisk kernel: [ 8928.364273] ata3.00: failed command: READ DMA
Sep 12 22:15:09 asterisk kernel: [ 8928.366028] ata3.00: cmd c8/00:08:48:0f:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Sep 12 22:15:09 asterisk kernel: [ 8928.366030]          res 51/40:00:48:0f:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 12 22:15:09 asterisk kernel: [ 8928.369643] ata3.00: status: { DRDY ERR }
Sep 12 22:15:09 asterisk kernel: [ 8928.371420] ata3.00: error: { UNC }
Sep 12 22:15:10 asterisk kernel: [ 8929.340218] ata3.00: configured for UDMA/133
Sep 12 22:15:10 asterisk kernel: [ 8929.340233] ata3: EH complete
Sep 12 22:15:11 asterisk kernel: [ 8930.332648] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 12 22:15:11 asterisk kernel: [ 8930.334453] ata3.00: BMDMA stat 0x44
Sep 12 22:15:11 asterisk kernel: [ 8930.336245] ata3.00: failed command: READ DMA
Sep 12 22:15:11 asterisk kernel: [ 8930.337995] ata3.00: cmd c8/00:08:48:0f:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Sep 12 22:15:11 asterisk kernel: [ 8930.337998]          res 51/40:00:48:0f:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 12 22:15:11 asterisk kernel: [ 8930.341583] ata3.00: status: { DRDY ERR }
Sep 12 22:15:11 asterisk kernel: [ 8930.343360] ata3.00: error: { UNC }
Sep 12 22:15:12 asterisk kernel: [ 8931.344205] ata3.00: configured for UDMA/133
Sep 12 22:15:12 asterisk kernel: [ 8931.344220] ata3: EH complete
Sep 12 22:15:13 asterisk kernel: [ 8932.306376] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 12 22:15:13 asterisk kernel: [ 8932.308201] ata3.00: BMDMA stat 0x44
Sep 12 22:15:13 asterisk kernel: [ 8932.309948] ata3.00: failed command: READ DMA
Sep 12 22:15:13 asterisk kernel: [ 8932.311695] ata3.00: cmd c8/00:08:48:0f:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Sep 12 22:15:13 asterisk kernel: [ 8932.311697]          res 51/40:00:48:0f:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 12 22:15:13 asterisk kernel: [ 8932.315262] ata3.00: status: { DRDY ERR }
Sep 12 22:15:13 asterisk kernel: [ 8932.317070] ata3.00: error: { UNC }
Sep 12 22:15:14 asterisk kernel: [ 8933.284204] ata3.00: configured for UDMA/133
Sep 12 22:15:14 asterisk kernel: [ 8933.284234] sd 2:0:0:0: [sda] Unhandled sense code
Sep 12 22:15:14 asterisk kernel: [ 8933.284237] sd 2:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 12 22:15:14 asterisk kernel: [ 8933.284241] sd 2:0:0:0: [sda]  Sense Key : Medium Error [current] [descriptor]
Sep 12 22:15:14 asterisk kernel: [ 8933.284246] Descriptor sense data with sense descriptors (in hex):
Sep 12 22:15:14 asterisk kernel: [ 8933.284248]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Sep 12 22:15:14 asterisk kernel: [ 8933.284256]         00 00 0f 48
Sep 12 22:15:14 asterisk kernel: [ 8933.284260] sd 2:0:0:0: [sda]  Add. Sense: Unrecovered read error - auto reallocate failed
Sep 12 22:15:14 asterisk kernel: [ 8933.284267] sd 2:0:0:0: [sda] CDB: Read(10): 28 00 00 00 0f 48 00 00 08 00
Sep 12 22:15:14 asterisk kernel: [ 8933.284274] end_request: I/O error, dev sda, sector 3912
Sep 12 22:15:14 asterisk kernel: [ 8933.286065] Buffer I/O error on device sda1, logical block 233
Sep 12 22:15:14 asterisk kernel: [ 8933.287889] ata3: EH complete
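
The numbers in the log can be cross-checked with a little shell arithmetic.
What follows is only a sketch. It assumes that sda1 starts at sector 2048
(the partition table is not shown in this thread, but that start is
consistent with disk sector 3912 being 4 KiB logical block 233 of sda1) and
that sectors are 512 bytes, as smartctl reports. The function names are
made up for illustration.

```shell
#!/bin/sh
# Cross-check the failing LBA against the dd and kernel messages.
# PART_START=2048 is an assumption: the partition table is not shown.
SECTOR_SIZE=512
PART_START=2048

# Convert the LBA bytes of the Read(10) CDB ("28 00 00 00 0f 48 ...",
# big-endian, written here without spaces) to a decimal sector number.
lba_from_hex() {
    printf '%d\n' "0x$1"
}

# Byte offset of an absolute disk LBA, relative to the start of the partition.
partition_offset() {
    echo $(( ($1 - PART_START) * SECTOR_SIZE ))
}

lba=$(lba_from_hex 00000f48)
off=$(partition_offset "$lba")
echo "failing LBA:              $lba"
echo "byte offset within sda1:  $off"
echo "4 KiB logical block:      $(( off / 4096 ))"
# dd skipped one block of 524228 bytes before it started reading.
echo "bytes dd could read:      $(( off - 524228 ))"
```

Under these assumptions the script reports LBA 3912, offset 954368, logical
block 233 and 430140 readable bytes, which agrees with "logical block 233"
in the kernel log and with the 430140 bytes dd copied before the error.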


*Why doesn't it fail the first hard disk of the array!!??*

root@asterisk:~# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[0] sdb2[1]
       949236 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
       311619448 blocks super 1.2 [2/2] [UU]

unused devices: <none>


root@asterisk:~# mdadm --detail /dev/md0
/dev/md0:
         Version : 1.2
   Creation Time : Fri Jun 15 22:45:13 2012
      Raid Level : raid1
      Array Size : 311619448 (297.18 GiB 319.10 GB)
   Used Dev Size : 311619448 (297.18 GiB 319.10 GB)
    Raid Devices : 2
   Total Devices : 2
     Persistence : Superblock is persistent

     Update Time : Wed Sep 12 22:07:58 2012
           State : clean
  Active Devices : 2
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0

            Name : asterisk:0  (local to host asterisk)
            UUID : cea0c4c3:181e2ee3:e4d1f3c0:1008ea62
          Events : 68

     Number   Major   Minor   RaidDevice State
        0       8        1        0      active sync   /dev/sda1
        1       8       17        1      active sync   /dev/sdb1


As you can see, the firmware of the hard disk reports a read error and
Linux still doesn't fail the drive: this is the best way to corrupt data.
As far as I know it should fail the bad drive, or at least try to resync
it, allowing the firmware to reallocate the bad sectors on write.

I really want to understand how raid1 is expected to work; I simply
cannot trust something like this. I'd like to take advantage of the
failure to learn something about Linux's raid1 behavior.

Thanks,
Niccolò





More info about the failed disk:

root@asterisk:~# smartctl -a /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-2-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F1 DT
Device Model:     SAMSUNG HD322HJ
Serial Number:    S17AJDWQ402689
LU WWN Device Id: 5 0000f0 003046298
Firmware Version: 1AC01110
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Wed Sep 12 22:27:56 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                         was never started.
                                         Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 114) The previous self-test completed having
                                         the read element of the test failed.
Total time to complete Offline
data collection:                ( 3888) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         Offline surface scan supported.
                                         Self-test supported.
                                         Conveyance Self-test supported.
                                         Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  66) minutes.
Conveyance self-test routine
recommended polling time:        (   8) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                         SCT Error Recovery Control supported.
                                         SCT Feature Control supported.
                                         SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   099   099   051    Pre-fail  Always       -       428
  3 Spin_Up_Time            0x0007   094   094   011    Pre-fail  Always       -       2810
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1077
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       9666
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       8915
 10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1077
 13 Read_Soft_Error_Rate    0x000e   099   099   000    Old_age   Always       -       400
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   099    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       400
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   063   055   000    Old_age   Always       -       37 (Min/Max 28/45)
194 Temperature_Celsius     0x0022   063   054   000    Old_age   Always       -       37 (Min/Max 28/46)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       355155576
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   096   096   000    Old_age   Always       -       361

SMART Error Log Version: 1
ATA Error Count: 173 (device log contains only the most recent five errors)
         CR = Command Register [HEX]
         FR = Features Register [HEX]
         SC = Sector Count Register [HEX]
         SN = Sector Number Register [HEX]
         CL = Cylinder Low Register [HEX]
         CH = Cylinder High Register [HEX]
         DH = Device/Head Register [HEX]
         DC = Device Command Register [HEX]
         ER = Error register [HEX]
         ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 173 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 00 48 0f 00 e0  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 08 48 0f 00 e0 08  18d+09:02:03.824  READ DMA
   ec 00 00 00 00 00 a0 08  18d+09:02:03.814  IDENTIFY DEVICE
   ef 03 46 00 00 00 a0 08  18d+09:02:03.814  SET FEATURES [Set transfer mode]
   ec 00 00 00 00 00 a0 08  18d+09:02:02.824  IDENTIFY DEVICE

Error 172 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 00 48 0f 00 e0  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 08 48 0f 00 e0 08  18d+09:02:01.814  READ DMA
   ec 00 00 00 00 00 a0 08  18d+09:02:01.804  IDENTIFY DEVICE
   ef 03 46 00 00 00 a0 08  18d+09:02:01.804  SET FEATURES [Set transfer mode]
   ec 00 00 00 00 00 a0 08  18d+09:02:00.854  IDENTIFY DEVICE

Error 171 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 00 48 0f 00 e0  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 08 48 0f 00 e0 08  18d+09:01:59.874  READ DMA
   ec 00 00 00 00 00 a0 08  18d+09:01:59.864  IDENTIFY DEVICE
   ef 03 46 00 00 00 a0 08  18d+09:01:59.864  SET FEATURES [Set transfer mode]
   ec 00 00 00 00 00 a0 08  18d+09:01:58.904  IDENTIFY DEVICE

Error 170 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 00 48 0f 00 e0  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 08 48 0f 00 e0 08  18d+09:01:57.924  READ DMA
   ec 00 00 00 00 00 a0 08  18d+09:01:57.924  IDENTIFY DEVICE
   ef 03 46 00 00 00 a0 08  18d+09:01:57.924  SET FEATURES [Set transfer mode]
   ec 00 00 00 00 00 a0 08  18d+09:01:56.964  IDENTIFY DEVICE

Error 169 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 00 48 0f 00 e0  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 08 48 0f 00 e0 08  18d+09:01:55.984  READ DMA
   ec 00 00 00 00 00 a0 08  18d+09:01:55.974  IDENTIFY DEVICE
   ef 03 46 00 00 00 a0 08  18d+09:01:55.974  SET FEATURES [Set transfer mode]
   ec 00 00 00 00 00 a0 08  18d+09:01:55.014  IDENTIFY DEVICE

SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       20%      8895      3912
# 2  Short offline       Aborted by host               20%      8871      -
# 3  Short offline       Aborted by host               20%      8847      -
# 4  Short offline       Aborted by host               20%      8823      -
# 5  Extended offline    Aborted by host               90%      8800      -
# 6  Short offline       Aborted by host               20%      8799      -
# 7  Short offline       Aborted by host               20%      8775      -
# 8  Short offline       Aborted by host               20%      8751      -
# 9  Short offline       Aborted by host               20%      8727      -
#10  Short offline       Aborted by host               20%      8703      -
#11  Short offline       Aborted by host               20%      8679      -
#12  Short offline       Aborted by host               20%      8655      -
#13  Extended offline    Aborted by host               90%      8632      -
#14  Short offline       Aborted by host               20%      8631      -
#15  Short offline       Aborted by host               20%      8607      -
#16  Short offline       Aborted by host               20%      8583      -
#17  Short offline       Aborted by host               20%      8559      -
#18  Short offline       Aborted by host               20%      8535      -
#19  Short offline       Aborted by host               20%      8511      -
#20  Short offline       Aborted by host               20%      8487      -
#21  Extended offline    Aborted by host               90%      8464      -

Note: selective self-test log revision number (0) not 1 implies that no 
selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever 
been run
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing
Selective self-test flags (0x0):
   After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
-- 
http://www.linuxsystems.it
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: raid1 issue after disk failure: both disks of the array are still active
From: Robin Hill @ 2012-09-13 10:34 UTC
  To: Niccolò Belli; +Cc: linux-raid

On Thu Sep 13, 2012 at 12:01:59PM +0200, Niccolò Belli wrote:

> Hi,
> I have a raid1 array with two disks, distro is Squeeze amd64. /dev/sda 
> is slowly dying, here is a snippet of "smartctl -a /dev/sda":
> 
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
> 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
> 
> The bad sector is in the second half-MB of the disk, in fact with "dd 
> if=/dev/sda1 of=/dev/null bs=524228 count=1 skip=1" I get this output in 
> /var/log/syslog:
> 
> root@asterisk:~# dd if=/dev/sda1 of=/dev/null bs=524228 count=1 skip=1
> 0+1 records in
> 0+1 records out
> 430140 bytes (430 kB) copied, 11.7265 s, 36.7 kB/s
> 
<- snip dmesg output ->
> 
> *Why doesn't it fail the first hard disk of the array!!??*
> 
Has anything actually attempted to read from that part of the array?
Even if so, it may just have happened to read from the working disk
anyway. md can only detect the error when it tries to read/write that
sector of that disk.

Your best bet now is to do an array check:
    echo check > /sys/block/md0/md/sync_action

This will force a read of all disks in the array. This should trigger
the read error, causing an attempt to re-write the faulty block, in turn
causing the drive to remap the bad sector (assuming the re-write fails).
This should also be scheduled to run regularly for all arrays in order
to pick up these sorts of issues before they cause major problems during
a rebuild.
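
The check described above can be wrapped in a few helper functions. This is
only a sketch: sync_action and mismatch_cnt are sysfs files that md exposes,
while the function names, the MD_SYSFS override and POLL_SECONDS are
hypothetical and exist only to keep the example self-contained.

```shell
#!/bin/sh
# Sketch: start an md consistency check and report the result.
# MD_SYSFS is a hypothetical override so the functions can be pointed at
# another array (e.g. /sys/block/md1/md); md0 is the array in this thread.
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

# Ask md to read every member of the array (needs root on a real system).
start_check() {
    echo check > "$MD_SYSFS/sync_action"
}

# Print the current sync action; "idle" means nothing is running.
current_action() {
    cat "$MD_SYSFS/sync_action"
}

# Block until the running check, repair or resync has finished.
wait_until_idle() {
    while [ "$(current_action)" != "idle" ]; do
        sleep "${POLL_SECONDS:-10}"
    done
}

# Print the mismatch count recorded by the most recent pass.
mismatch_count() {
    cat "$MD_SYSFS/mismatch_cnt"
}

# Usage (as root):
#   start_check && wait_until_idle && mismatch_count
```

Progress of a running check also shows up in /proc/mdstat, so polling
sync_action is just one way to wait for it.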

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

* Re: raid1 issue after disk failure: both disks of the array are still active
From: Niccolò Belli @ 2012-09-13 10:46 UTC
  To: linux-raid

On 13/09/2012 12:34, Robin Hill wrote:
> Has anything actually attempted to read from that part of the array?
> Even if so, it may just have happened to read from the working disk
> anyway. md can only detect the error when it tries to read/write that
> sector of that disk.

I forced a read with "dd if=/dev/md0 of=/dev/null bs=524228 count=1 
skip=1", I even get errors in syslog!

> Your best bet now is to do an array check:
>      echo check > /sys/block/md0/md/sync_action
>
> This will force a read of all disks in the array. This should trigger
> the read error, causing an attempt to re-write the faulty block, in turn
> causing the drive to remap the bad sector (assuming the re-write fails).
> This should also be scheduled to run regularly for all arrays in order
> to pick up these sorts of issues before they cause major problems during
> a rebuild.

/etc/init.d/mdadm should do exactly this kind of thing (distro is
Debian Squeeze). I have this in cron.d:
57 0 * * 0 root if [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ]; then /usr/share/mdadm/checkarray --cron --all --idle --quiet; fi

Unfortunately it seems it didn't work :(
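
For reference, the date test in that cron entry can be tried on its own.
The schedule "57 0 * * 0" fires every Sunday at 00:57, and the day-of-month
test lets the command through only when that Sunday is one of the first
seven days of the month. A small sketch of the same logic follows; the
function name in_first_week is made up for illustration.

```shell
#!/bin/sh
# Sketch of the day-of-month test used by the cron entry.

# Succeeds when the given day of month (01..31) is within the first week.
in_first_week() {
    [ "$1" -le 7 ]
}

# In a crontab the percent sign has to be escaped as \%, as in the entry
# quoted in this message; on the command line a plain % is used.
day=$(date +%d)
if in_first_week "$day"; then
    echo "day $day: checkarray would run"
else
    echo "day $day: checkarray would be skipped"
fi
```

So the scheduled check runs at most once a month, and a sector that went
bad after the last run is not looked at again until the next one.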

Shouldn't a dd if=/dev/md0 be enough to trigger the read error?

Thanks,
Niccolò
-- 
http://www.linuxsystems.it

* Re: raid1 issue after disk failure: both disks of the array are still active
From: Niccolò Belli @ 2012-09-13 11:29 UTC
  To: linux-raid

On 13/09/2012 12:56, Adam Goryachev wrote:
> See /etc/default/mdadm:
> # AUTOCHECK:
> #   should mdadm run periodic redundancy checks over your arrays? See
> #   /etc/cron.d/mdadm.
> AUTOCHECK=true
>
> Regards,
> Adam

I have had it set to true since I installed the system.

Cheers,
Niccolò
-- 
http://www.linuxsystems.it

* Re: raid1 issue after disk failure: both disks of the array are still active
From: Roberto Spadim @ 2012-09-13 15:32 UTC
  To: Niccolò Belli; +Cc: linux-raid

2012/9/13 Niccolò Belli <darkbasic@linuxsystems.it>
>
> On 13/09/2012 12:34, Robin Hill wrote:
>
>> Has anything actually attempted to read from that part of the array?
>> Even if so, it may just have happened to read from the working disk
>> anyway. md can only detect the error when it tries to read/write that
>> sector of that disk.
>
>
> I forced a read with "dd if=/dev/md0 of=/dev/null bs=524228 count=1 skip=1", I even get errors in syslog!

You forced a read from a block device (md0) that is an md raid1. Check
the read-balance code in the raid1 source: it doesn't read from all
disks, just from the 'nearest/fastest' disk. To read from all of them,
run a check with the command: echo check > /sys/block/md0/md/sync_action

>
>
>
>> Your best bet now is to do an array check:
>>      echo check > /sys/block/md0/md/sync_action
>>
>> This will force a read of all disks in the array. This should trigger
>> the read error, causing an attempt to re-write the faulty block, in turn
>> causing the drive to remap the bad sector (assuming the re-write fails).
>> This should also be scheduled to run regularly for all arrays in order
>> to pick up these sorts of issues before they cause major problems during
>> a rebuild.
>
>
> /etc/init.d/mdadm should do exactly this kind of thing (distro is Debian Squeeze). I have this in cron.d:
> 57 0 * * 0 root if [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ]; then /usr/share/mdadm/checkarray --cron --all --idle --quiet; fi
>
> Unfortunately it seems it didn't work :(

I'm not a Debian expert, but it's easier to put the bash logic in a
script file and call that file from cron. Maybe some part of your cron
configuration file failed?

>
>
> Shouldn't a dd if=/dev/md0 be enough to trigger the read error?

no

>
>
> Thanks,
> Niccolò
>
> --
> http://www.linuxsystems.it




--
Roberto Spadim
Spadim Technology / SPAEmpresarial




* Re: raid1 issue after disk failure: both disks of the array are still active
From: Niccolò Belli @ 2012-09-13 15:48 UTC
  To: linux-raid

On 13/09/2012 17:32, Roberto Spadim wrote:
> You forced a read from a block device (md0) that is an md raid1. Check
> the read-balance code in the raid1 source: it doesn't read from all
> disks, just from the 'nearest/fastest' disk.

I know, but since reading from md0 triggered a warning in dmesg (like
the one I posted previously), it *did* read from the broken disk!!! Both
disks of the array are still active even after the read error. Why?

Cheers,
Niccolò
-- 
http://www.linuxsystems.it

* Re: raid1 issue after disk failure: both disks of the array are still active
From: Roberto Spadim @ 2012-09-13 15:53 UTC
  To: Niccolò Belli; +Cc: linux-raid

Check the read error counters of your md array in the sys folder
(max_read_errors, device/errors).
Maybe it got an error, retried, and then read without error.
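
A quick way to look at those counters is sketched below. The sysfs files
max_read_errors and dev-*/errors are the ones md provides; the function
name and the MD_SYSFS override are hypothetical, added only so the snippet
can be pointed at a different array.

```shell
#!/bin/sh
# Sketch: print md's read error limit and the per-device error counters.
# MD_SYSFS is a hypothetical override; md0 is the array in this thread.
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

show_read_errors() {
    echo "max_read_errors: $(cat "$MD_SYSFS/max_read_errors")"
    # Each member device has its own directory, e.g. dev-sda1.
    for dev in "$MD_SYSFS"/dev-*; do
        [ -r "$dev/errors" ] || continue
        echo "${dev##*/}: $(cat "$dev/errors")"
    done
}

# Usage: show_read_errors
```

A non-zero count for a member would suggest that md saw read errors on it
and corrected them without kicking the device out of the array.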

2012/9/13 Niccolò Belli <darkbasic@linuxsystems.it>:
> On 13/09/2012 17:32, Roberto Spadim wrote:
>
>> You forced a read from a block device (md0) that is an md raid1. Check
>> the read-balance code in the raid1 source: it doesn't read from all
>> disks, just from the 'nearest/fastest' disk.
>
>
> I know, but since reading from md0 triggered a warning in dmesg (like the
> previous I posted) it *did* read from the broken disk!!! Both disks of the
> array are still active even after the read error, why?
>
> Cheers,
> Niccolò
>
> --
> http://www.linuxsystems.it



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-13 15:53           ` Roberto Spadim
@ 2012-09-14  7:54             ` Niccolò Belli
  0 siblings, 0 replies; 27+ messages in thread
From: Niccolò Belli @ 2012-09-14  7:54 UTC (permalink / raw)
  To: linux-raid

Il 13/09/2012 17:53, Roberto Spadim ha scritto:
> Check the read error counters of your md array in the sys folder
> (max_read_errors, device/errors).
> Maybe it got an error, retried, and then read without error.

max_read_errors is 20 and md0/errors is 0. Unfortunately I rebooted in 
the meantime and I can no longer trigger the read error while 
reading md0 :(
-- 
http://www.linuxsystems.it

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-13 10:34 ` Robin Hill
  2012-09-13 10:46   ` Niccolò Belli
@ 2012-09-13 17:02   ` Chris Murphy
  2012-09-13 17:39     ` Roberto Spadim
  2012-09-14  7:16     ` Mikael Abrahamsson
  1 sibling, 2 replies; 27+ messages in thread
From: Chris Murphy @ 2012-09-13 17:02 UTC (permalink / raw)
  To: Linux RAID

On Sep 13, 2012, at 4:34 AM, Robin Hill wrote:
> 
> Your best bet now is to do an array check:
>    echo check > /sys/block/md0/md/sync_action
> 
> This will force a read of all disks in the array. This should trigger
> the read error, causing an attempt to re-write the faulty block, in turn
> causing the drive to remap the bad sector (assuming the re-write fails).

"check" records errors, no action is taken by the md driver to correct it, although the disk firmware itself may try reallocation. So far, that appears to not be the case. 

"repair" causes the md driver to write correct data (from copy or reconstructed from parity), which should force the disk firmware to reallocate the affected LBAs from bad physical sectors to good ones.

It seems in this case "repair" is indicated.

Chris Murphy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-13 17:02   ` Chris Murphy
@ 2012-09-13 17:39     ` Roberto Spadim
  2012-09-13 20:13       ` Chris Murphy
  2012-09-14  7:16     ` Mikael Abrahamsson
  1 sibling, 1 reply; 27+ messages in thread
From: Roberto Spadim @ 2012-09-13 17:39 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux RAID

> "check" records errors, no action is taken by the md driver to correct it, although the disk firmware itself may try reallocation. So far, that appears to not be the case.
>
> "repair" causes the md driver to write correct data (from copy or reconstructed from parity), which should force the disk firmware to reallocate the affected LBAs from bad physical sectors to good ones.
>
> It seems in this case "repair" is indicated.
>
Or replace the bad disk =)


-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-13 17:39     ` Roberto Spadim
@ 2012-09-13 20:13       ` Chris Murphy
  0 siblings, 0 replies; 27+ messages in thread
From: Chris Murphy @ 2012-09-13 20:13 UTC (permalink / raw)
  To: Linux RAID


On Sep 13, 2012, at 11:39 AM, Roberto Spadim wrote:

>> "check" records errors, no action is taken by the md driver to correct it, although the disk firmware itself may try reallocation. So far, that appears to not be the case.
>> 
>> "repair" causes the md driver to write correct data (from copy or reconstructed from parity), which should force the disk firmware to reallocate the affected LBAs from bad physical sectors to good ones.
>> 
>> It seems in this case "repair" is indicated.
>> 
> Or replace the bad disk =)

Yes or replace the disk. But from the SMART info provided, it's just a few sectors that are affected. None of the attribute values have even budged so far.

Chris Murphy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-13 17:02   ` Chris Murphy
  2012-09-13 17:39     ` Roberto Spadim
@ 2012-09-14  7:16     ` Mikael Abrahamsson
  2012-09-14  7:45       ` Niccolò Belli
  2012-09-14  8:13       ` NeilBrown
  1 sibling, 2 replies; 27+ messages in thread
From: Mikael Abrahamsson @ 2012-09-14  7:16 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux RAID

On Thu, 13 Sep 2012, Chris Murphy wrote:

> "check" records errors, no action is taken by the md driver to correct 
> it, although the disk firmware itself may try reallocation. So far, that 
> appears to not be the case.
>
> "repair" causes the md driver to write correct data (from copy or 
> reconstructed from parity), which should force the disk firmware to 
> reallocate the affected LBAs from bad physical sectors to good ones.
>
> It seems in this case "repair" is indicated.

I was under the impression that "check" would check if all data blocks and 
parity are correct, and record if there is a parity mismatch. This would 
then be corrected by using "repair" at a later time.

I was also under the impression that if there was a read error on a drive 
during "check", that read error would be corrected using parity because 
it's obviously a hard error, not a logical error.

Could you (or someone else) please confirm that my impression is wrong and 
if there indeed is a hard read error using "check", this will not be 
corrected? I would be interested in knowing why this decision was taken to 
have this behaviour, as I feel that if there is a hard read error, this 
should always be corrected using parity.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-14  7:16     ` Mikael Abrahamsson
@ 2012-09-14  7:45       ` Niccolò Belli
  2012-09-14 18:04         ` Chris Murphy
  2012-09-14  8:13       ` NeilBrown
  1 sibling, 1 reply; 27+ messages in thread
From: Niccolò Belli @ 2012-09-14  7:45 UTC (permalink / raw)
  To: linux-raid

I would also like to know whether raid1 will *surely* use data from the 
other disk to rewrite the broken sector after a CHECK. I mean, it did 
nothing even after a read error on md0 with a "failed command: READ DMA" 
in dmesg (possibly because the read succeeded after a few retries?). I 
read that when raid1 is in doubt there is a 50%-50% chance it uses data 
from the good disk; wouldn't it be better to fail the broken disk and 
then re-add it to the array?

Cheers,
Niccolò
-- 
http://www.linuxsystems.it

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-14  7:45       ` Niccolò Belli
@ 2012-09-14 18:04         ` Chris Murphy
  2012-09-14 18:27           ` Robin Hill
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Murphy @ 2012-09-14 18:04 UTC (permalink / raw)
  To: Linux RAID

On Sep 14, 2012, at 1:45 AM, Niccolò Belli wrote:

> I also would like to know if the raid1 will *surely* use data from the other disk to write on the broken sector after a CHECK.

Not according to documentation. In normal operation, and for a repair, what you describe is correct. But not for check.

> I mean, i did nothing even after a read error on md0 with a "failed command: READ DMA" in dmesg (possibly because after a few reads it succeeded reading?).

Possibly. Possibly the disk firmware finally was able to relocate that sector's data. Possibly its ECC thinks it has reconstructed the data on that sector, but in fact the data is corrupt.

> I read that when raid1 is in doubt there is a 50%-50% chance it uses data from the good disk, wouldn't be better to fail the broken disk and then re-add it to the array?

I don't know what this means.

But I think there's a misunderstanding about disk behavior. A disk's reliability is not always a binary condition. Most often it's a continuum, because sector problems are masked by disk's ECC, and they go entirely unreported to the kernel, and thus md. This includes the case when the disk ECC detects an error, and thinks it has corrected it, but actually returns bogus (corrupt) data rather than a read error; as well as when disk ECC does not detect an error at all, but the data is in fact corrupt.

The md driver has no practical choice but to trust the data the disk returns, absent an error. So I'm confused by what you mean by "when raid1 is in doubt" and what you mean by this "50/50 chance" part.

When the disk ECC detects an error, and fails to correct it, only then will it report a read error to the kernel, and then md will get that data elsewhere. There is no good reason for md to mark a 99.99% correctly performing disk as faulty. If it did this, you've unnecessarily abandoned those 99.99% useful sectors, and in so doing have significantly reduced redundancy.

There's a reason why there are check and repair functions, rather than wholesale discarding of an otherwise valuable disk with a handful of bad sectors (or even one), and the ensuing loss of redundancy. Check is read-only, so it will be faster than repair, which is a good reason to use check frequently and repair less frequently, unless check warrants it. And there's good reason to include some periodic smartd testing as well, since there are parts of the disk that md check/repair can't test *and* because the disk ECC masks problems, whereas SMART should report them.
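
As a rough illustration of the SMART side, here is a small sketch that pulls the two attributes discussed in this thread (197 and 198) out of "smartctl -A" output. The helper name smart_pending is hypothetical, and the sketch assumes the usual attribute table layout with the raw value in the last column.

```shell
#!/bin/sh
# Read "smartctl -A" output on stdin and print NAME=RAW_VALUE for
# attribute 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable).
# Assumes the raw value is the last whitespace-separated field on the line.
smart_pending() {
    awk '$1 == 197 || $1 == 198 { print $2 "=" $NF }'
}
```

It would be used as "smartctl -A /dev/sda | smart_pending". Non-zero values that persist after a rewrite of the affected sectors would be a reason to look more closely at the disk.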

Chris Murphy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-14 18:04         ` Chris Murphy
@ 2012-09-14 18:27           ` Robin Hill
  2012-09-14 18:53             ` Chris Murphy
  0 siblings, 1 reply; 27+ messages in thread
From: Robin Hill @ 2012-09-14 18:27 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux RAID

[-- Attachment #1: Type: text/plain, Size: 1958 bytes --]

On Fri Sep 14, 2012 at 12:04:56 -0600, Chris Murphy wrote:

> 
> On Sep 14, 2012, at 1:45 AM, Niccolò Belli wrote:
> 
> > I also would like to know if the raid1 will *surely* use data from
> > the other disk to write on the broken sector after a CHECK.
> 
> Not according to documentation. In normal operation, and for a repair,
> what you describe is correct. But not for check.
> 
Maybe you need to reread the documentation. The md manual page says:

    Requesting a scrub will cause md to read every block on every device
    in the array, and check that the data is consistent. For RAID1 and
    RAID10, this means checking that the copies are identical. For
    RAID4, RAID5, RAID6 this means checking that the parity block is (or
    blocks are) correct.

    If a read error is detected during this process, the normal
    read-error handling causes correct data to be found from other
    devices and to be written back to the faulty device. In many case
    this will effectively fix the bad block.

So a check will repair cases where the data cannot be read at all, but
will not repair cases where the data is returned but does not match the
data on the other mirror(s).
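
To make the distinction concrete, here is a minimal sketch of driving a scrub through sysfs and reading the result. It assumes the sync_action and mismatch_cnt files under /sys/block/<array>/md. The function names and the optional sysfs-root argument are made up for illustration.

```shell
#!/bin/sh
# Start a scrub on an md array.
# $1: action (check or repair)
# $2: array name (e.g. md0)
# $3: sysfs root, defaults to /sys (hypothetical knob, only useful for testing)
md_scrub_start() {
    case $1 in
        check|repair) ;;
        *) echo "action must be check or repair" >&2; return 1 ;;
    esac
    echo "$1" > "${3:-/sys}/block/$2/md/sync_action"
}

# Print the current scrub state and the mismatch counter.
# $1: array name; $2: sysfs root (default /sys)
md_scrub_report() {
    md_dir="${2:-/sys}/block/$1/md"
    echo "sync_action=$(cat "$md_dir/sync_action") mismatch_cnt=$(cat "$md_dir/mismatch_cnt")"
}
```

For example, "md_scrub_start check md0", then "md_scrub_report md0" once sync_action has returned to idle. A non-zero mismatch_cnt after a check would indicate copies that were readable but differ, which is the case a check alone does not rewrite.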

> > I read that when raid1 is in doubt there is a 50%-50% chance it uses
> > data from the good disk, wouldn't be better to fail the broken disk
> > and then re-add it to the array?
> 
> I don't know what this means.
> 
I assume he's referring to cases where the data is read successfully but
does not match the data on the mirror(s). In this case a repair will
cause one copy to overwrite the others, which may or may not be the
correct copy (md has no way of knowing for a mirrored pair).

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-14 18:27           ` Robin Hill
@ 2012-09-14 18:53             ` Chris Murphy
  2012-09-15 19:05               ` Niccolò Belli
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Murphy @ 2012-09-14 18:53 UTC (permalink / raw)
  To: Linux RAID


On Sep 14, 2012, at 12:27 PM, Robin Hill wrote:

> On Fri Sep 14, 2012 at 12:04:56 -0600, Chris Murphy wrote:
> 
>> 
>> On Sep 14, 2012, at 1:45 AM, Niccolò Belli wrote:
>> 
>>> I also would like to know if the raid1 will *surely* use data from
>>> the other disk to write on the broken sector after a CHECK.
>> 
>> Not according to documentation. In normal operation, and for a repair,
>> what you describe is correct. But not for check.
>> 
> Maybe you need to reread the documentation.

Probably. It's densely packed.

> So a check will repair cases where the data cannot be read at all, but
> will not repair cases where the data is returned but does not match the
> data on the other mirror(s).

Yes, I now see the distinction between disk read-error and an array block mismatch.

> 
>>> I read that when raid1 is in doubt there is a 50%-50% chance it uses
>>> data from the good disk, wouldn't be better to fail the broken disk
>>> and then re-add it to the array?
>> 
>> I don't know what this means.
>> 
> I assume he's referring to cases where the data is read successfully but
> does not match the data on the mirror(s). In this case a repair will
> cause one copy to overwrite the others, which may or may not be the
> correct copy (md has no way of knowing for a mirrored pair).


I understand the ambiguity. It seems ill advised to arbitrarily replace what could be valid data. So some clarification on what repair does in a raid 1,10 block mismatch would be useful, as it may be repair shouldn't be used: rather use check and find out what file(s) are affected by the mismatch and replace the files from backup.

This statement from documentation is confusing to me: "For RAID1/RAID10, all but one block are overwritten with the content of that one block."

Chris Murphy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-14 18:53             ` Chris Murphy
@ 2012-09-15 19:05               ` Niccolò Belli
  2012-09-15 19:41                 ` Robin Hill
  0 siblings, 1 reply; 27+ messages in thread
From: Niccolò Belli @ 2012-09-15 19:05 UTC (permalink / raw)
  To: linux-raid

CHECK didn't help me, so I did an
"echo repair > /sys/block/md0/md/sync_action". REPAIR didn't work either :(

Here is syslog of REPAIR:

Sep 15 19:34:10 asterisk mdadm[2117]: RebuildStarted event detected on 
md device /dev/md/0
Sep 15 19:34:10 asterisk kernel: [258470.152296] md: requested-resync of 
RAID array md0
Sep 15 19:34:10 asterisk kernel: [258470.152301] md: minimum 
_guaranteed_  speed: 1000 KB/sec/disk.
Sep 15 19:34:10 asterisk kernel: [258470.152304] md: using maximum 
available idle IO bandwidth (but not more than 200000 KB/sec) for 
requested-resync.
Sep 15 19:34:10 asterisk kernel: [258470.152310] md: using 128k window, 
over a total of 311619448k.
Sep 15 19:34:11 asterisk kernel: [258471.165653] ata3.00: exception 
Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 15 19:34:11 asterisk kernel: [258471.167468] ata3.00: BMDMA stat 0x44
Sep 15 19:34:11 asterisk kernel: [258471.169912] ata3.00: failed 
command: READ DMA EXT
Sep 15 19:34:11 asterisk kernel: [258471.172769] ata3.00: cmd 
25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
Sep 15 19:34:11 asterisk kernel: [258471.172771]          res 
51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 15 19:34:11 asterisk kernel: [258471.176753] ata3.00: status: { DRDY 
ERR }
Sep 15 19:34:11 asterisk kernel: [258471.178605] ata3.00: error: { UNC }
Sep 15 19:34:12 asterisk kernel: [258472.148217] ata3.00: configured for 
UDMA/133
Sep 15 19:34:12 asterisk kernel: [258472.148232] ata3: EH complete
Sep 15 19:34:13 asterisk kernel: [258473.131054] ata3.00: exception 
Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 15 19:34:13 asterisk kernel: [258473.132881] ata3.00: BMDMA stat 0x44
Sep 15 19:34:13 asterisk kernel: [258473.134639] ata3.00: failed 
command: READ DMA EXT
Sep 15 19:34:13 asterisk kernel: [258473.136413] ata3.00: cmd 
25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
Sep 15 19:34:13 asterisk kernel: [258473.136415]          res 
51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 15 19:34:13 asterisk kernel: [258473.141768] ata3.00: status: { DRDY 
ERR }
Sep 15 19:34:13 asterisk kernel: [258473.144049] ata3.00: error: { UNC }
Sep 15 19:34:14 asterisk kernel: [258474.112209] ata3.00: configured for 
UDMA/133
Sep 15 19:34:14 asterisk kernel: [258474.112224] ata3: EH complete
Sep 15 19:34:15 asterisk kernel: [258475.071642] ata3.00: exception 
Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 15 19:34:15 asterisk kernel: [258475.073476] ata3.00: BMDMA stat 0x44
Sep 15 19:34:15 asterisk kernel: [258475.075240] ata3.00: failed 
command: READ DMA EXT
Sep 15 19:34:15 asterisk kernel: [258475.077027] ata3.00: cmd 
25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
Sep 15 19:34:15 asterisk kernel: [258475.077029]          res 
51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 15 19:34:15 asterisk kernel: [258475.080720] ata3.00: status: { DRDY 
ERR }
Sep 15 19:34:15 asterisk kernel: [258475.083512] ata3.00: error: { UNC }
Sep 15 19:34:16 asterisk kernel: [258476.100935] ata3.00: configured for 
UDMA/133
Sep 15 19:34:16 asterisk kernel: [258476.100960] ata3: EH complete
Sep 15 19:41:29 asterisk asterisk[3492]: rc_avpair_new: unknown 
attribute 1490026597
Sep 15 19:41:46 asterisk asterisk[3492]: rc_avpair_new: unknown 
attribute 1490026597
Sep 15 19:41:52 asterisk asterisk[3492]: rc_avpair_new: unknown 
attribute 1490026597
Sep 15 19:42:52 asterisk asterisk[3492]: rc_avpair_new: unknown 
attribute 1490026597
Sep 15 19:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2 
Currently unreadable (pending) sectors
Sep 15 19:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline 
uncorrectable sectors
Sep 15 19:50:51 asterisk mdadm[2117]: Rebuild26 event detected on md 
device /dev/md/0
Sep 15 20:07:31 asterisk mdadm[2117]: Rebuild53 event detected on md 
device /dev/md/0
Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2 
Currently unreadable (pending) sectors
Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline 
uncorrectable sectors
Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 
Temperature changed +4 Celsius to 42 Celsius (Min/Max 30/46)
Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], SMART 
Usage Attribute: 201 Soft_Read_Error_Rate changed from 99 to 100
Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sdb [SAT], SMART 
Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 60
Sep 15 20:24:11 asterisk mdadm[2117]: Rebuild75 event detected on md 
device /dev/md/0
Sep 15 20:40:51 asterisk mdadm[2117]: Rebuild93 event detected on md 
device /dev/md/0
Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2 
Currently unreadable (pending) sectors
Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline 
uncorrectable sectors
Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], SMART 
Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 60
Sep 15 20:47:24 asterisk kernel: [262863.781068] md: md0: 
requested-resync done.
Sep 15 20:47:24 asterisk mdadm[2117]: RebuildFinished event detected on 
md device /dev/md/0



I still get:

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Offline             Completed: read failure       90%      8985         3912

and

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1


How is it possible? Next thing I will try is manually failing /dev/sda 
and filling it with zeros. I would like to do a *low level format* but I 
didn't find the utility for my disk :(

Disk is:

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F1 DT
Device Model:     SAMSUNG HD322HJ
Serial Number:    S17AJDWQ402689
LU WWN Device Id: 5 0000f0 003046298
Firmware Version: 1AC01110
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Sat Sep 15 21:02:36 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===



root@asterisk:~# smartctl -a /dev/sda -P show
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-2-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Drive found in smartmontools Database.  Drive identity strings:
MODEL:              SAMSUNG HD322HJ
FIRMWARE:           1AC01110
match smartmontools Drive Database entry:
MODEL REGEXP:       SAMSUNG 
HD(083G|16[12]G|25[12]H|32[12]H|50[12]I|642J|75[23]L|10[23]U)J
FIRMWARE REGEXP:    .*
MODEL FAMILY:       SAMSUNG SpinPoint F1 DT
ATTRIBUTE OPTIONS:  None preset; no -v options are required.


Thanks,
Niccolò
-- 
http://www.linuxsystems.it

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-15 19:05               ` Niccolò Belli
@ 2012-09-15 19:41                 ` Robin Hill
  2012-09-15 22:06                   ` Niccolò Belli
  2012-09-16 10:42                   ` Niccolò Belli
  0 siblings, 2 replies; 27+ messages in thread
From: Robin Hill @ 2012-09-15 19:41 UTC (permalink / raw)
  To: Niccolò Belli; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 7309 bytes --]

On Sat Sep 15, 2012 at 09:05:25 +0200, Niccolò Belli wrote:

> CHECK didn't help me, so I did an
> "echo repair > /sys/block/md0/md/sync_action". REPAIR didn't work either :(
> 
Didn't work for what you were wanting anyway. It may well have worked
for its intended purpose.

> Here is syslog of REPAIR:
> 
> Sep 15 19:34:10 asterisk mdadm[2117]: RebuildStarted event detected on 
> md device /dev/md/0
> Sep 15 19:34:10 asterisk kernel: [258470.152296] md: requested-resync of 
> RAID array md0
> Sep 15 19:34:10 asterisk kernel: [258470.152301] md: minimum 
> _guaranteed_  speed: 1000 KB/sec/disk.
> Sep 15 19:34:10 asterisk kernel: [258470.152304] md: using maximum 
> available idle IO bandwidth (but not more than 200000 KB/sec) for 
> requested-resync.
> Sep 15 19:34:10 asterisk kernel: [258470.152310] md: using 128k window, 
> over a total of 311619448k.
> Sep 15 19:34:11 asterisk kernel: [258471.165653] ata3.00: exception 
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 19:34:11 asterisk kernel: [258471.167468] ata3.00: BMDMA stat 0x44
> Sep 15 19:34:11 asterisk kernel: [258471.169912] ata3.00: failed 
> command: READ DMA EXT
> Sep 15 19:34:11 asterisk kernel: [258471.172769] ata3.00: cmd 
> 25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
> Sep 15 19:34:11 asterisk kernel: [258471.172771]          res 
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 19:34:11 asterisk kernel: [258471.176753] ata3.00: status: { DRDY 
> ERR }
> Sep 15 19:34:11 asterisk kernel: [258471.178605] ata3.00: error: { UNC }
> Sep 15 19:34:12 asterisk kernel: [258472.148217] ata3.00: configured for 
> UDMA/133
> Sep 15 19:34:12 asterisk kernel: [258472.148232] ata3: EH complete
> Sep 15 19:34:13 asterisk kernel: [258473.131054] ata3.00: exception 
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 19:34:13 asterisk kernel: [258473.132881] ata3.00: BMDMA stat 0x44
> Sep 15 19:34:13 asterisk kernel: [258473.134639] ata3.00: failed 
> command: READ DMA EXT
> Sep 15 19:34:13 asterisk kernel: [258473.136413] ata3.00: cmd 
> 25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
> Sep 15 19:34:13 asterisk kernel: [258473.136415]          res 
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 19:34:13 asterisk kernel: [258473.141768] ata3.00: status: { DRDY 
> ERR }
> Sep 15 19:34:13 asterisk kernel: [258473.144049] ata3.00: error: { UNC }
> Sep 15 19:34:14 asterisk kernel: [258474.112209] ata3.00: configured for 
> UDMA/133
> Sep 15 19:34:14 asterisk kernel: [258474.112224] ata3: EH complete
> Sep 15 19:34:15 asterisk kernel: [258475.071642] ata3.00: exception 
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 19:34:15 asterisk kernel: [258475.073476] ata3.00: BMDMA stat 0x44
> Sep 15 19:34:15 asterisk kernel: [258475.075240] ata3.00: failed 
> command: READ DMA EXT
> Sep 15 19:34:15 asterisk kernel: [258475.077027] ata3.00: cmd 
> 25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
> Sep 15 19:34:15 asterisk kernel: [258475.077029]          res 
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 19:34:15 asterisk kernel: [258475.080720] ata3.00: status: { DRDY 
> ERR }
> Sep 15 19:34:15 asterisk kernel: [258475.083512] ata3.00: error: { UNC }
> Sep 15 19:34:16 asterisk kernel: [258476.100935] ata3.00: configured for 
> UDMA/133
> Sep 15 19:34:16 asterisk kernel: [258476.100960] ata3: EH complete
> Sep 15 19:41:29 asterisk asterisk[3492]: rc_avpair_new: unknown 
> attribute 1490026597
> Sep 15 19:41:46 asterisk asterisk[3492]: rc_avpair_new: unknown 
> attribute 1490026597
> Sep 15 19:41:52 asterisk asterisk[3492]: rc_avpair_new: unknown 
> attribute 1490026597
> Sep 15 19:42:52 asterisk asterisk[3492]: rc_avpair_new: unknown 
> attribute 1490026597
> Sep 15 19:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2 
> Currently unreadable (pending) sectors
> Sep 15 19:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline 
> uncorrectable sectors
> Sep 15 19:50:51 asterisk mdadm[2117]: Rebuild26 event detected on md 
> device /dev/md/0
> Sep 15 20:07:31 asterisk mdadm[2117]: Rebuild53 event detected on md 
> device /dev/md/0
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2 
> Currently unreadable (pending) sectors
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline 
> uncorrectable sectors
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 
> Temperature changed +4 Celsius to 42 Celsius (Min/Max 30/46)
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], SMART 
> Usage Attribute: 201 Soft_Read_Error_Rate changed from 99 to 100
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sdb [SAT], SMART 
> Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 60
> Sep 15 20:24:11 asterisk mdadm[2117]: Rebuild75 event detected on md 
> device /dev/md/0
> Sep 15 20:40:51 asterisk mdadm[2117]: Rebuild93 event detected on md 
> device /dev/md/0
> Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2 
> Currently unreadable (pending) sectors
> Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline 
> uncorrectable sectors
> Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], SMART 
> Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 60
> Sep 15 20:47:24 asterisk kernel: [262863.781068] md: md0: 
> requested-resync done.
> Sep 15 20:47:24 asterisk mdadm[2117]: RebuildFinished event detected on 
> md device /dev/md/0
> 
> 
Okay, so the drive logs an exception at 19:34:11, then completes its
error handling at 19:34:16.

If md hasn't failed the drive then either:
  - md didn't get a read error
  - md got a success message when re-writing the block
  - there's a bug in md and it's not handled the error at all

My guess would be on one of the first two (I'm not sure what's logged if
md gets a read error and does a re-write).
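
As an aside, the failing sector can be read off the "res" line in the log quoted above. The sketch below assumes the libata taskfile print order (status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and treats the address as a plain 48-bit LBA. It ignores the extra LBA bits that a 28-bit command keeps in the device register, so take it as a simplification. The function name ata_res_lba is made up.

```shell
#!/bin/sh
# Decode the LBA from a libata "res" (or "cmd") taskfile string.
# $1: taskfile, e.g. 51/40:00:90:17:00/40:00:00:00:00/e0
# The body runs in a subshell so the IFS change does not leak to the caller.
ata_res_lba() (
    IFS='/:'
    set -- $1
    [ $# -eq 12 ] || { echo "expected 12 taskfile fields, got $#" >&2; exit 1; }
    # $4..$6 are lbal, lbam, lbah; $9..$11 are hob_lbal, hob_lbam, hob_lbah.
    # Concatenate from most to least significant byte and convert from hex.
    echo $(( 0x${11}${10}${9}${6}${5}${4} ))
)
```

With the line from the repair log, "ata_res_lba 51/40:00:90:17:00/40:00:00:00:00/e0" prints 6032. That is a whole-disk LBA, not an offset inside the partition or the array.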

> 
> I still get:
> 
> Num  Test_Description    Status                  Remaining 
> LifeTime(hours)  LBA_of_first_error
> # 1  Offline             Completed: read failure       90%      8985 
>       3912
> 
> and
> 
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always 
>        -       2
> 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age 
> Offline      -       1
> 
> 
> How is it possible? Next thing I will try is manually failing /dev/sda 
> and filling it with zeros. I would like to do a *low level format* but I 
> didn't find the utility for my disk :(
> 
I'm pretty sure there's no such thing as a *low level format* for any
modern disk (or not one that does anything more than writing a known
pattern to the disk). The low-level information is far too precisely
laid out for the disk heads to be able to write.

Writing zeros is certainly what I'd do in this situation - I've done it
for several drives in the past where they've had offline uncorrectable
sectors flagged.
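
Roughly, the sequence I have in mind looks like the sketch below. It only prints the commands instead of running them, since the dd step is destructive. The array and member names are parameters because I'm assuming, not certain, that the md0 member is /dev/sda1. The function name plan_zero_member is made up.

```shell
#!/bin/sh
# Print (do not run) the steps for zeroing one array member and re-adding it.
# $1: array device (e.g. /dev/md0)    -- assumption, adjust to your setup
# $2: member device (e.g. /dev/sda1)  -- assumption, adjust to your setup
plan_zero_member() {
    [ $# -eq 2 ] || { echo "usage: plan_zero_member ARRAY MEMBER" >&2; return 1; }
    echo "mdadm $1 --fail $2"              # mark the member faulty
    echo "mdadm $1 --remove $2"            # detach it from the array
    echo "dd if=/dev/zero of=$2 bs=1M"     # overwrite every sector of the member
    echo "mdadm $1 --add $2"               # re-add; md then resyncs it in full
}
```

Calling "plan_zero_member /dev/md0 /dev/sda1" lists the four steps for review. Bear in mind the array runs without redundancy from the --fail until the resync finishes.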

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-15 19:41                 ` Robin Hill
@ 2012-09-15 22:06                   ` Niccolò Belli
  2012-09-16 10:18                     ` Robin Hill
  2012-09-16 10:42                   ` Niccolò Belli
  1 sibling, 1 reply; 27+ messages in thread
From: Niccolò Belli @ 2012-09-15 22:06 UTC (permalink / raw)
  To: linux-raid

Il 15/09/2012 21:41, Robin Hill ha scritto:
> If md hasn't failed the drive then either:
>    - md didn't get a read error
>    - md got a success message when re-writing the block
>    - there's a bug in md and it's not handled the error at all

It seems to be case one. While manually verifying the checksums with

for i in $(seq 50); do
    skip=$(( (i - 1) * 50 + 10 ))
    dd if=/dev/sda1 of=sda${i} bs=100000 count=50 skip=${skip} > /dev/null 2> /dev/null
    dd if=/dev/sdb1 of=sdb${i} bs=100000 count=50 skip=${skip} > /dev/null 2> /dev/null
    md5sum sda${i}; md5sum sdb${i}; echo
done

I get this in syslog:

Sep 15 23:50:09 asterisk kernel: [273828.407914] scsi_verify_blk_ioctl: 30 callbacks suppressed
Sep 15 23:50:09 asterisk kernel: [273828.407920] dd: sending ioctl 80306d02 to a partition!
Sep 15 23:50:09 asterisk kernel: [273828.407925] dd: sending ioctl 80306d02 to a partition!
Sep 15 23:50:10 asterisk kernel: [273829.422247] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 15 23:50:10 asterisk kernel: [273829.424071] ata3.00: BMDMA stat 0x44
Sep 15 23:50:10 asterisk kernel: [273829.425855] ata3.00: failed command: READ DMA
Sep 15 23:50:10 asterisk kernel: [273829.427625] ata3.00: cmd c8/00:00:68:17:00/00:00:00:00:00/e0 tag 0 dma 131072 in
Sep 15 23:50:10 asterisk kernel: [273829.427627]          res 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
Sep 15 23:50:10 asterisk kernel: [273829.431184] ata3.00: status: { DRDY ERR }
Sep 15 23:50:10 asterisk kernel: [273829.432992] ata3.00: error: { UNC }
Sep 15 23:50:11 asterisk kernel: [273830.404203] ata3.00: configured for UDMA/133
Sep 15 23:50:11 asterisk kernel: [273830.404217] ata3: EH complete
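
For reference, the failing sector can be read straight out of the "res" line in the log above. The sketch below assumes the usual libata layout for that string (status/error:count:LBA low:LBA mid:LBA high/.../device) and a 28-bit command such as READ DMA; taskfile_lba28 is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Decode a 28-bit LBA from a libata "res" (or "cmd") taskfile string.
# Assumed field order: status/error:nsect:lbal:lbam:lbah/<hob bytes>/device
taskfile_lba28() {
    tf=$1
    rest=${tf#*/}       # drop the leading status byte, e.g. "51/"
    low=${rest%%/*}     # "40:00:90:17:00"
    dev=${tf##*/}       # "e0"; its low nibble holds LBA bits 24..27
    old_ifs=$IFS
    IFS=:
    set -- $low         # $1=error $2=nsect $3=lbal $4=lbam $5=lbah
    IFS=$old_ifs
    echo $(( ((0x$dev & 0x0f) << 24) | (0x$5 << 16) | (0x$4 << 8) | 0x$3 ))
}

taskfile_lba28 "51/40:00:90:17:00/40:00:00:00:00/e0"   # prints 6032
taskfile_lba28 "51/40:00:48:0f:00/40:00:00:00:00/e0"   # prints 3912
```

The second string is the "res" line from the 12 Sep log in the first message; its result agrees with the "UNC at LBA = 0x00000f48 = 3912" that smartctl reports elsewhere in this thread, which suggests the assumed field order is right. A 48-bit command would also need the high-order bytes from the middle group.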



but this is the output of the command:


b7d4e3c3bb461a1aa6619c22ef11d072  sda1
b7d4e3c3bb461a1aa6619c22ef11d072  sdb1

8649ae5a732bc808f228677b27a1e9b6  sda2
8649ae5a732bc808f228677b27a1e9b6  sdb2

8649ae5a732bc808f228677b27a1e9b6  sda3
8649ae5a732bc808f228677b27a1e9b6  sdb3

8649ae5a732bc808f228677b27a1e9b6  sda4
8649ae5a732bc808f228677b27a1e9b6  sdb4

8649ae5a732bc808f228677b27a1e9b6  sda5
8649ae5a732bc808f228677b27a1e9b6  sdb5

8649ae5a732bc808f228677b27a1e9b6  sda6
8649ae5a732bc808f228677b27a1e9b6  sdb6

8649ae5a732bc808f228677b27a1e9b6  sda7
8649ae5a732bc808f228677b27a1e9b6  sdb7

f2fb77841db5dd577449cfeee07c4108  sda8
f2fb77841db5dd577449cfeee07c4108  sdb8

e311789a1fabd3758694c35c74e20612  sda9
e311789a1fabd3758694c35c74e20612  sdb9

8649ae5a732bc808f228677b27a1e9b6  sda10
8649ae5a732bc808f228677b27a1e9b6  sdb10

8649ae5a732bc808f228677b27a1e9b6  sda11
8649ae5a732bc808f228677b27a1e9b6  sdb11

8649ae5a732bc808f228677b27a1e9b6  sda12
8649ae5a732bc808f228677b27a1e9b6  sdb12

8649ae5a732bc808f228677b27a1e9b6  sda13
8649ae5a732bc808f228677b27a1e9b6  sdb13

8649ae5a732bc808f228677b27a1e9b6  sda14
8649ae5a732bc808f228677b27a1e9b6  sdb14

8649ae5a732bc808f228677b27a1e9b6  sda15
8649ae5a732bc808f228677b27a1e9b6  sdb15

8649ae5a732bc808f228677b27a1e9b6  sda16
8649ae5a732bc808f228677b27a1e9b6  sdb16

8649ae5a732bc808f228677b27a1e9b6  sda17
8649ae5a732bc808f228677b27a1e9b6  sdb17

8649ae5a732bc808f228677b27a1e9b6  sda18
8649ae5a732bc808f228677b27a1e9b6  sdb18

8649ae5a732bc808f228677b27a1e9b6  sda19
8649ae5a732bc808f228677b27a1e9b6  sdb19

8649ae5a732bc808f228677b27a1e9b6  sda20
8649ae5a732bc808f228677b27a1e9b6  sdb20

8649ae5a732bc808f228677b27a1e9b6  sda21
8649ae5a732bc808f228677b27a1e9b6  sdb21

8649ae5a732bc808f228677b27a1e9b6  sda22
8649ae5a732bc808f228677b27a1e9b6  sdb22

8649ae5a732bc808f228677b27a1e9b6  sda23
8649ae5a732bc808f228677b27a1e9b6  sdb23

8649ae5a732bc808f228677b27a1e9b6  sda24
8649ae5a732bc808f228677b27a1e9b6  sdb24

8649ae5a732bc808f228677b27a1e9b6  sda25
8649ae5a732bc808f228677b27a1e9b6  sdb25

8649ae5a732bc808f228677b27a1e9b6  sda26
8649ae5a732bc808f228677b27a1e9b6  sdb26

4531da1579310425e2d3343846f5b16d  sda27
4531da1579310425e2d3343846f5b16d  sdb27

3721bf34547dc2967741bf6bfbd76670  sda28
3721bf34547dc2967741bf6bfbd76670  sdb28

14a2be518f90d3060b3438ac75d91e7e  sda29
14a2be518f90d3060b3438ac75d91e7e  sdb29

36fb275af7608d0aff8c7b454168f8c3  sda30
36fb275af7608d0aff8c7b454168f8c3  sdb30

2026b2cf40470f059d264b2c78f3a989  sda31
2026b2cf40470f059d264b2c78f3a989  sdb31

36f825d926a6195c70efabd0a045fce0  sda32
36f825d926a6195c70efabd0a045fce0  sdb32

44be6fdd8adb83f1328d6fa21e72a5f9  sda33
44be6fdd8adb83f1328d6fa21e72a5f9  sdb33

90a771705992c1ba15c17a30520b0b56  sda34
90a771705992c1ba15c17a30520b0b56  sdb34

c37584adcad03dc74b0ea9e431fd78e3  sda35
c37584adcad03dc74b0ea9e431fd78e3  sdb35

f044f24e528316cf5a40e894e7d84c36  sda36
f044f24e528316cf5a40e894e7d84c36  sdb36

4447d6a338fdac8cf179dde83deb7f43  sda37
4447d6a338fdac8cf179dde83deb7f43  sdb37

b4115994e66cb739dc49fedcaf5649eb  sda38
b4115994e66cb739dc49fedcaf5649eb  sdb38

65c9226105cbba0fd7dbefb9bedac940  sda39
65c9226105cbba0fd7dbefb9bedac940  sdb39

e05366f8be4b66595c2aadbb133c6b4c  sda40
e05366f8be4b66595c2aadbb133c6b4c  sdb40

afc039520def52590a5fd289b423545a  sda41
afc039520def52590a5fd289b423545a  sdb41

6d47c3b1265afc3dbbd832d8088501c4  sda42
6d47c3b1265afc3dbbd832d8088501c4  sdb42

749140fe9a80f20dd5449976db66ce0f  sda43
749140fe9a80f20dd5449976db66ce0f  sdb43

41bd354c1cca819dd4a8d19b8c1a637e  sda44
41bd354c1cca819dd4a8d19b8c1a637e  sdb44

b2fc15b0147853d76a7c5fe87820d26b  sda45
b2fc15b0147853d76a7c5fe87820d26b  sdb45

a9b3ac7ac3556950887959dea3b6ae3c  sda46
a9b3ac7ac3556950887959dea3b6ae3c  sdb46

3daf2ee98c1d3d24f779234f6f7d58d6  sda47
3daf2ee98c1d3d24f779234f6f7d58d6  sdb47

31fe58f24393d199b63102a45b8b44c3  sda48
31fe58f24393d199b63102a45b8b44c3  sdb48

43e0657b350cd60efdf1ca0c8324f85c  sda49
43e0657b350cd60efdf1ca0c8324f85c  sdb49

94f883b45084b72cd9269a4821b2d509  sda50
94f883b45084b72cd9269a4821b2d509  sdb50



*BUT* if I start reading from the start of the partition (+0 instead of +10 
in skip=), I get a mismatch on both md0 and md1 (which is supposed to 
be OK)!

root@asterisk:~# i=1; dd if=/dev/sda1 of=sda${i} bs=100000 count=50 
skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; dd if=/dev/sdb1 
of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> 
/dev/null; md5sum sda${i}; md5sum sdb${i}
9f9f11ffeb0aed0abc8097417b293f41  sda1
394efde218ad700774bfcb3c43255529  sdb1
root@asterisk:~# i=1; dd if=/dev/sda2 of=sda${i} bs=100000 count=50 
skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; dd if=/dev/sdb2 
of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> 
/dev/null; md5sum sda${i}; md5sum sdb${i}
8cb0b6fa2bf7f0f88a2a2a91598429d4  sda1
732c42e14b8e78930d08cdb4f1c49a40  sdb1

Shouldn't raid1 match even at the very beginning of the partition?


On 15/09/2012 22:40, Roberto Spadim wrote:
 > disks aren't expensive today, why not replace the disk and be happy?

Because the problem appeared after a power failure; I think the disk 
*should* be OK.

Cheers,
Niccolò
-- 
http://www.linuxsystems.it
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-15 22:06                   ` Niccolò Belli
@ 2012-09-16 10:18                     ` Robin Hill
  0 siblings, 0 replies; 27+ messages in thread
From: Robin Hill @ 2012-09-16 10:18 UTC (permalink / raw)
  To: Niccolò Belli; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3653 bytes --]

On Sun Sep 16, 2012 at 12:06:48 +0200, Niccolò Belli wrote:

> On 15/09/2012 21:41, Robin Hill wrote:
> > If md hasn't failed the drive then either:
> >    - md didn't get a read error
> >    - md got a success message when re-writing the block
> >    - there's a bug in md and it's not handled the error at all
> 
> It seems it's case one, while manually verifying the checksums with
> 
> for i in $(seq 50); do dd if=/dev/sda1 of=sda${i} bs=100000 count=50 
> skip=$((($i-1)*50+10)) > /dev/null 2> /dev/null; dd if=/dev/sdb1 
> of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+10)) > /dev/null 2> 
> /dev/null; md5sum sda${i}; md5sum sdb${i}; echo; done
> 
> I get this in syslog:
> 
> Sep 15 23:50:09 asterisk kernel: [273828.407914] scsi_verify_blk_ioctl: 
> 30 callbacks suppressed
> Sep 15 23:50:09 asterisk kernel: [273828.407920] dd: sending ioctl 
> 80306d02 to a partition!
> Sep 15 23:50:09 asterisk kernel: [273828.407925] dd: sending ioctl 
> 80306d02 to a partition!
> Sep 15 23:50:10 asterisk kernel: [273829.422247] ata3.00: exception 
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 23:50:10 asterisk kernel: [273829.424071] ata3.00: BMDMA stat 0x44
> Sep 15 23:50:10 asterisk kernel: [273829.425855] ata3.00: failed 
> command: READ DMA
> Sep 15 23:50:10 asterisk kernel: [273829.427625] ata3.00: cmd 
> c8/00:00:68:17:00/00:00:00:00:00/e0 tag 0 dma 131072 in
> Sep 15 23:50:10 asterisk kernel: [273829.427627]          res 
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 23:50:10 asterisk kernel: [273829.431184] ata3.00: status: { DRDY 
> ERR }
> Sep 15 23:50:10 asterisk kernel: [273829.432992] ata3.00: error: { UNC }
> Sep 15 23:50:11 asterisk kernel: [273830.404203] ata3.00: configured for 
> UDMA/133
> Sep 15 23:50:11 asterisk kernel: [273830.404217] ata3: EH complete
> 
> 
> 
> but this is the output of the command:
> 
> 
> b7d4e3c3bb461a1aa6619c22ef11d072  sda1
> b7d4e3c3bb461a1aa6619c22ef11d072  sdb1
>
<- snip sets of identical checksums ->
>
> 94f883b45084b72cd9269a4821b2d509  sda50
> 94f883b45084b72cd9269a4821b2d509  sdb50
> 
Okay, so it looks like the drive is managing to return the correct data
eventually (or it's returning some default value which has also been
written to the other mirror now).

> *BUT* if I start reading from the start of the partition (+0 instead of +10 
> in skip=) I get a mismatch, on both md0 and md1 (which is supposed to 
> be ok)!!!
> 
> root@asterisk:~# i=1; dd if=/dev/sda1 of=sda${i} bs=100000 count=50 
> skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; dd if=/dev/sdb1 
> of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> 
> /dev/null; md5sum sda${i}; md5sum sdb${i}
> 9f9f11ffeb0aed0abc8097417b293f41  sda1
> 394efde218ad700774bfcb3c43255529  sdb1
> root@asterisk:~# i=1; dd if=/dev/sda2 of=sda${i} bs=100000 count=50 
> skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; dd if=/dev/sdb2 
> of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> 
> /dev/null; md5sum sda${i}; md5sum sdb${i}
> 8cb0b6fa2bf7f0f88a2a2a91598429d4  sda1
> 732c42e14b8e78930d08cdb4f1c49a40  sdb1
> 
> Shouldn't raid1 match even at the very beginning of the partition?
> 
No, the start of the partition will contain the md superblock (for 1.1
and 1.2 metadata formats), which will be slightly different for the two
devices.
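
One way to compare only the data areas is to skip the metadata before checksumming. The sketch below runs on two scratch image files standing in for the mirror halves; the 4 KiB "superblock area" and the helper name sum_from_offset are assumptions made for the demo. On a real member, the number of sectors to skip would have to be taken from the "Data Offset" field that mdadm --examine prints for 1.x metadata.

```shell
#!/bin/sh
# Checksum a device or image starting at a given 512-byte sector offset,
# so that per-device metadata at the start is left out of the comparison.
sum_from_offset() {
    # $1 = path, $2 = offset in 512-byte sectors
    dd if="$1" bs=512 skip="$2" 2>/dev/null | md5sum | cut -d' ' -f1
}

# Demo: identical 64 KiB payload behind a different 4 KiB header on each image.
tmp=$(mktemp -d)
head -c 65536 /dev/urandom > "$tmp/payload"
{ head -c 4096 /dev/urandom; cat "$tmp/payload"; } > "$tmp/a.img"
{ head -c 4096 /dev/urandom; cat "$tmp/payload"; } > "$tmp/b.img"

whole_a=$(sum_from_offset "$tmp/a.img" 0)
whole_b=$(sum_from_offset "$tmp/b.img" 0)
data_a=$(sum_from_offset "$tmp/a.img" 8)    # 8 sectors = 4096 bytes skipped
data_b=$(sum_from_offset "$tmp/b.img" 8)
rm -rf "$tmp"

if [ "$whole_a" = "$whole_b" ]; then echo "whole image: match"; else echo "whole image: differ"; fi
if [ "$data_a" = "$data_b" ]; then echo "data area:   match"; else echo "data area:   differ"; fi
```

With the headers included the checksums differ, and with them skipped they match, which is the same pattern as the +0 versus +10 reads quoted above.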

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-15 19:41                 ` Robin Hill
  2012-09-15 22:06                   ` Niccolò Belli
@ 2012-09-16 10:42                   ` Niccolò Belli
  2012-09-16 15:26                     ` Chris Murphy
  1 sibling, 1 reply; 27+ messages in thread
From: Niccolò Belli @ 2012-09-16 10:42 UTC (permalink / raw)
  To: linux-raid

On 15/09/2012 21:41, Robin Hill wrote:
> Writing zeros is certainly what I'd do in this situation - I've done it
> for several drives in the past where they've had offline uncorrectable
> sectors flagged.

I just tried writing zeros and it didn't help: the disk doesn't reallocate 
the bad sector :(
-- 
http://www.linuxsystems.it

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-16 10:42                   ` Niccolò Belli
@ 2012-09-16 15:26                     ` Chris Murphy
  2012-09-16 15:31                       ` Niccolò Belli
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Murphy @ 2012-09-16 15:26 UTC (permalink / raw)
  To: Linux RAID

On Sep 16, 2012, at 4:42 AM, Niccolò Belli wrote:

> On 15/09/2012 21:41, Robin Hill wrote:
>> Writing zeros is certainly what I'd do in this situation - I've done it
>> for several drives in the past where they've had offline uncorrectable
>> sectors flagged.
> 
> I just tried to write zeros, it didn't help: the disk doesn't reallocate the bad sector :(

Something isn't right. How did you write zeros?

I went through the archives and wasn't able to find the full smartctl -x results for this drive. Can you post them?

Does anyone know for sure if the ATA Secure Erase command verifies its writes? i.e. does it even have a way of knowing if there are bad sectors on a write and remove them from use? Or is the write-read verification always occurring on hard drives?

Chris Murphy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-16 15:26                     ` Chris Murphy
@ 2012-09-16 15:31                       ` Niccolò Belli
  2012-09-16 23:35                         ` Niccolò Belli
  0 siblings, 1 reply; 27+ messages in thread
From: Niccolò Belli @ 2012-09-16 15:31 UTC (permalink / raw)
  To: linux-raid

On 16/09/2012 17:26, Chris Murphy wrote:
> Something isn't right. How did you write zeros?

dd if=/dev/zero of=/dev/sda
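
As an alternative to zeroing the whole disk, a single sector can be targeted with dd's seek= option. The sketch below is only a demonstration on a scratch image file; zero_sector and sector_sum are hypothetical helper names, and 3912 is the LBA from the smartctl log. It is an assumption, not something tested in this thread, that a real drive would also need the write to bypass the page cache (for example dd's oflag=direct) so that the kernel does not first try to read the bad sector.

```shell
#!/bin/sh
# Overwrite exactly one 512-byte sector with zeros (hypothetical helper).
# DANGER: pointed at a real block device, this destroys that sector's data.
zero_sector() {
    # $1 = target (image file or device), $2 = LBA in 512-byte sectors
    dd if=/dev/zero of="$1" bs=512 seek="$2" count=1 conv=notrunc 2>/dev/null
}

# Print the md5 of a single sector so that neighbours can be compared.
sector_sum() {
    dd if="$1" bs=512 skip="$2" count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

# Demo on a scratch image of 4000 random sectors, not on a disk.
img=$(mktemp)
head -c $((4000 * 512)) /dev/urandom > "$img"

prev_before=$(sector_sum "$img" 3911)
next_before=$(sector_sum "$img" 3913)
zero_sector "$img" 3912
prev_after=$(sector_sum "$img" 3911)
next_after=$(sector_sum "$img" 3913)

# Non-zero hex digits left in sector 3912 (an empty string means all zeros).
leftover=$(dd if="$img" bs=512 skip=3912 count=1 2>/dev/null | od -An -v -tx1 | tr -d ' \n0')
size_after=$(( $(wc -c < "$img") ))
rm -f "$img"

[ -z "$leftover" ] && echo "sector 3912 zeroed"
[ "$prev_before" = "$prev_after" ] && [ "$next_before" = "$next_after" ] && echo "neighbours untouched"
```

conv=notrunc only matters for the image file here (it keeps dd from truncating it); seek= counts in units of bs, so with bs=512 it is the LBA directly when the target is the whole disk rather than a partition.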


> I went through the archives and wasn't able to find the full smartctl -x results for this drive, can you post them?

root@asterisk:~# smartctl -x /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-2-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F1 DT
Device Model:     SAMSUNG HD322HJ
Serial Number:    S17AJDWQ402689
LU WWN Device Id: 5 0000f0 003046298
Firmware Version: 1AC01110
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Sun Sep 16 17:29:50 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x06) Offline data collection activity
                                         was aborted by the device with a fatal error.
                                         Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 114) The previous self-test completed having
                                         the read element of the test failed.
Total time to complete Offline
data collection:                ( 3888) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         Offline surface scan supported.
                                         Self-test supported.
                                         Conveyance Self-test supported.
                                         Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  66) minutes.
Conveyance self-test routine
recommended polling time:        (   8) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                         SCT Error Recovery Control supported.
                                         SCT Feature Control supported.
                                         SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   099   099   051    -    712
  3 Spin_Up_Time            POS---   094   094   011    -    2810
  4 Start_Stop_Count        -O--CK   099   099   000    -    1077
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  7 Seek_Error_Rate         POSR--   253   253   051    -    0
  8 Seek_Time_Performance   P-S--K   100   100   015    -    9508
  9 Power_On_Hours          -O--CK   098   098   000    -    9006
 10 Spin_Retry_Count        PO--CK   100   100   051    -    0
 11 Calibration_Retry_Count -O--C-   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   099   099   000    -    1077
 13 Read_Soft_Error_Rate    -OSR--   099   099   000    -    654
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        PO--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    908
188 Command_Timeout         -O--CK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   063   055   000    -    37 (Min/Max 28/45)
194 Temperature_Celsius     -O---K   063   054   000    -    37 (Min/Max 28/46)
195 Hardware_ECC_Recovered  -O-RC-   100   100   000    -    988053162
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O--C-   100   100   000    -    3
198 Offline_Uncorrectable   ----CK   100   100   000    -    1
199 UDMA_CRC_Error_Count    -OSRCK   100   100   000    -    0
200 Multi_Zone_Error_Rate   -O-R--   100   100   000    -    0
201 Soft_Read_Error_Rate    -O-R--   095   095   000    -    440
                             ||||||_ K auto-keep
                             |||||__ C event count
                             ||||___ R error rate
                             |||____ S speed/performance
                             ||_____ O updated online
                             |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
GP/S  Log at address 0x00 has    1 sectors [Log Directory]
SMART Log at address 0x01 has    1 sectors [Summary SMART error log]
SMART Log at address 0x02 has    2 sectors [Comprehensive SMART error log]
GP    Log at address 0x03 has    2 sectors [Ext. Comprehensive SMART error log]
SMART Log at address 0x06 has    1 sectors [SMART self-test log]
GP    Log at address 0x07 has    2 sectors [Extended self-test log]
SMART Log at address 0x09 has    1 sectors [Selective self-test log]
GP    Log at address 0x10 has    1 sectors [NCQ Command Error]
GP    Log at address 0x11 has    1 sectors [SATA Phy Event Counters]
GP/S  Log at address 0x80 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x81 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x82 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x83 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x84 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x85 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x86 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x87 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x88 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x89 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8a has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8b has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8c has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8d has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8e has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x8f has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x90 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x91 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x92 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x93 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x94 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x95 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x96 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x97 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x98 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x99 has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9a has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9b has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9c has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9d has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9e has   16 sectors [Host vendor specific log]
GP/S  Log at address 0x9f has   16 sectors [Host vendor specific log]
GP/S  Log at address 0xe0 has    1 sectors [SCT Command/Status]
GP/S  Log at address 0xe1 has    1 sectors [SCT Data Transfer]

SMART Extended Comprehensive Error Log Version: 1 (2 sectors)
Device Error Count: 450 (device log contains only the most recent 8 errors)
         CR     = Command Register
         FEATR  = Features Register
         COUNT  = Count (was: Sector Count) Register
         LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
         LH     = LBA High (was: Cylinder High) Register    ]   LBA
         LM     = LBA Mid (was: Cylinder Low) Register      ] Register
         LL     = LBA Low (was: Sector Number) Register     ]
         DV     = Device (was: Device/Head) Register
         DC     = Device Control Register
         ER     = Error register
         ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 450 [1] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:03:29.664  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:29.664  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:03:29.654  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:03:29.654  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:29.654  READ NATIVE MAX ADDRESS EXT

Error 449 [0] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:03:27.714  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:27.714  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:03:27.714  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:03:27.714  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:27.714  READ NATIVE MAX ADDRESS EXT

Error 448 [7] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:03:25.774  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:25.774  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:03:25.774  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:03:25.774  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:25.764  READ NATIVE MAX ADDRESS EXT

Error 447 [6] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:03:23.804  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:23.804  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:03:23.794  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:03:23.794  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:23.794  READ NATIVE MAX ADDRESS EXT

Error 446 [5] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:03:21.824  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:21.824  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:03:21.814  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:03:21.814  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:03:21.814  READ NATIVE MAX ADDRESS EXT

Error 445 [4] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:03:20.254  READ DMA
   c8 00 00 00 08 00 00 00 00 0f 40 e0 08 21d+23:03:20.254  READ DMA
   c8 00 00 00 08 00 00 00 00 0f 38 e0 08 21d+23:03:20.254  READ DMA
   c8 00 00 00 08 00 00 00 00 0f 30 e0 08 21d+23:03:20.254  READ DMA
   c8 00 00 00 08 00 00 00 00 0f 28 e0 08 21d+23:03:20.254  READ DMA

Error 444 [3] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:02:10.594  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:02:10.594  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:02:10.594  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:02:10.594  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:02:10.594  READ NATIVE MAX ADDRESS EXT

Error 443 [2] occurred at disk power-on lifetime: 9001 hours (375 days + 1 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER -- ST COUNT  LBA_48  LH LM LL DV DC
   -- -- -- == -- == == == -- -- -- -- --
   40 -- 51 00 00 00 00 00 00 0f 48 e0 00  Error: UNC at LBA = 0x00000f48 = 3912

   Commands leading to the command that caused the error were:
   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
   c8 00 00 00 08 00 00 00 00 0f 48 e0 08 21d+23:02:08.654  READ DMA
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:02:08.654  READ NATIVE MAX ADDRESS EXT
   ec 00 00 00 00 00 00 00 00 00 00 a0 08 21d+23:02:08.654  IDENTIFY DEVICE
   ef 00 03 00 46 00 00 00 00 00 00 a0 08 21d+23:02:08.654  SET FEATURES [Set transfer mode]
   27 00 00 00 00 00 00 00 00 00 00 e0 08 21d+23:02:08.654  READ NATIVE MAX ADDRESS EXT

SMART Extended Self-test Log Version: 0 (2 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       20%      8991         3912
# 2  Offline             Aborted by host               90%      8985         -
# 3  Offline             Aborted by host               90%      8981         -
# 4  Offline             Aborted by host               90%      8981         -
# 5  Extended offline    Aborted by host               90%      8980         -
# 6  Extended offline    Aborted by host               90%      8980         -
# 7  Short offline       Aborted by host               20%      8980         -
# 8  Short offline       Aborted by host               20%      8980         -
# 9  Extended offline    Aborted by host               90%      8968         -
#10  Short offline       Aborted by host               20%      8967         -
#11  Short offline       Aborted by host               20%      8943         -
#12  Short offline       Aborted by host               20%      8919         -
#13  Short offline       Aborted by host               20%      8895         -
#14  Short offline       Aborted by host               20%      8871         -
#15  Short offline       Aborted by host               20%      8847         -
#16  Short offline       Aborted by host               20%      8823         -
#17  Extended offline    Aborted by host               90%      8800         -
#18  Short offline       Aborted by host               20%      8799         -
#19  Short offline       Aborted by host               20%      8775         -
#20  Short offline       Aborted by host               20%      8751         -
#21  Short offline       Aborted by host               20%      8727         -

Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing
Selective self-test flags (0x0):
   After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  2
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                 37 Celsius
Power Cycle Max Temperature:         46 Celsius
Lifetime    Max Temperature:         46 Celsius
SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:     -4/72 Celsius
Min/Max Temperature Limit:           -9/77 Celsius
Temperature History Size (Index):    128 (36)

Index    Estimated Time   Temperature Celsius
   37    2012-09-16 15:22    37  ******************
  ...    ..(126 skipped).    ..  ******************
   36    2012-09-16 17:29    37  ******************

SCT Error Recovery Control:
            Read: Disabled
           Write: Disabled

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2           24  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2           32  Transition from drive PhyRdy to drive PhyNRdy
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

-- 
http://www.linuxsystems.it


* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-16 15:31                       ` Niccolò Belli
@ 2012-09-16 23:35                         ` Niccolò Belli
  2012-09-17  0:00                           ` Chris Murphy
  0 siblings, 1 reply; 27+ messages in thread
From: Niccolò Belli @ 2012-09-16 23:35 UTC (permalink / raw)
  To: linux-raid

I finally managed to reallocate the sectors!

I first tried sg3-utils, but got:

root@asterisk:~# sg_reassign --address=3912 /dev/sda
REASSIGN BLOCKS not supported

then I read this on the smartmontools mailing list:

<<
Possibly what is happening is that because he is only writing a partial
block, the OS is first trying to read the original block so that it
can preserve the parts that won't be changing. When this operation fails,
it blocks the write that would trigger reallocation of the bad sector.
Writing using the OS blocksize (typically 4096 on linux systems) properly
aligned should work around that issue.
 >>

so I tried with

dd if=/dev/zero of=/dev/sda bs=4096

and ta-daaa! :D

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       *0*
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1


The Current_Pending_Sector count is down to zero!
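
A note on the dd command quoted above: as written it has no seek= or count=, so taken literally it would zero /dev/sda from the first block to the end of the device. A more targeted variant could be sketched as below. It assumes 512-byte logical sectors and uses the LBA 3912 that was passed to sg_reassign earlier; the helper name aligned_dd_cmd and the oflag=direct option are illustrative assumptions, not what was actually run. The function only prints the command, it does not execute it.

```shell
#!/bin/sh
# Sketch only: prints a dd command, does not run it.
# Assumes 512-byte logical sectors and a 4096-byte OS block size.
aligned_dd_cmd() {
    lba=$1                               # failing LBA in 512-byte sectors
    dev=$2                               # target device, e.g. /dev/sda
    sectors_per_block=$((4096 / 512))
    block=$((lba / sectors_per_block))   # 4 KiB block that contains the LBA
    # seek= is counted in units of bs, so the write starts on a 4 KiB boundary.
    echo "dd if=/dev/zero of=$dev bs=4096 seek=$block count=1 oflag=direct"
}

aligned_dd_cmd 3912 /dev/sda
```

For LBA 3912 this prints a command with seek=489, which covers sectors 3912 to 3919. Zeroing that block destroys whatever it held, so this is only reasonable when the data can be restored from elsewhere, for instance from the other half of the mirror.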


Running smartctl -t offline /dev/sda then cleared Offline_Uncorrectable too:

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       *0*




For the sake of Google:

     Reallocated_Event_Count

     This counts the reallocation events the drive has already 
performed. We're hoping to get the hard disk to increase this number!

     Current_Pending_Sector

     The number of sectors that the drive thinks are dodgy. Bear in mind 
that drives sometimes change their mind about whether a sector is bad or 
not, so this number can go down without a reallocation occurring.

     Offline_Uncorrectable

     This is the number of sectors that the drive has attempted to 
correct itself, but failed. Running the command:

     smartctl -t offline /dev/hda

     should cause the drive to test the sectors and attempt to fix them. 
Not all drives support this though.
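
To watch the three counters described above, the attribute table can be filtered down to just those rows. A minimal sketch, assuming the usual `smartctl -A` layout in which the attribute name is the second column and the raw value is the last; the function name smart_sector_counts is made up for illustration, and the sample lines are the ones posted at the start of this thread:

```shell
#!/bin/sh
# Sketch: read `smartctl -A` style text on stdin and print NAME=RAW_VALUE
# for the three attributes discussed above.
smart_sector_counts() {
    awk '$2 == "Reallocated_Event_Count" ||
         $2 == "Current_Pending_Sector"  ||
         $2 == "Offline_Uncorrectable" { print $2 "=" $NF }'
}

# Sample input; on a real system: smartctl -A /dev/sda | smart_sector_counts
printf '%s\n' \
  '197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2' \
  '198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1' \
  | smart_sector_counts
```

With the sample lines this prints Current_Pending_Sector=2 and Offline_Uncorrectable=1, one per line.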




Thanks for helping!
Niccolò
-- 
http://www.linuxsystems.it
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-16 23:35                         ` Niccolò Belli
@ 2012-09-17  0:00                           ` Chris Murphy
  2012-09-17  0:03                             ` Niccolò Belli
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Murphy @ 2012-09-17  0:00 UTC (permalink / raw)
  To: Linux RAID


On Sep 16, 2012, at 5:35 PM, Niccolò Belli wrote:
> 
> then I read this on the smartmontools mailing list:
> 
> <<
> Possibly what is happening is that because he is only writing a partial
> block, the OS is first trying to read the original block so that it
> can preserve the parts that won't be changing. When this operation fails,
> it blocks the write that would trigger reallocation of the bad sector.
> Writing using the OS blocksize (typically 4096 on linux systems) properly
> aligned should work around that issue.
> >>
> 
> so I tried with
> 
> dd if=/dev/zero of=/dev/sda bs=4096
> 
> and ta-daaa!

Useful info. Obviously, Secure Erasing a disk takes a while and is overkill for just one sector. But I'm still curious whether anyone knows for sure that ATA Secure Erase removes bad sectors from use.

Chris Murphy


* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-17  0:00                           ` Chris Murphy
@ 2012-09-17  0:03                             ` Niccolò Belli
  0 siblings, 0 replies; 27+ messages in thread
From: Niccolò Belli @ 2012-09-17  0:03 UTC (permalink / raw)
  To: linux-raid

On 17/09/2012 02:00, Chris Murphy wrote:
> Obviously Secure Erasing a disk takes a while and is overkill for just one sector. But I'm still curious if anyone knows if for sure ATA Secure Erase will remove bad sectors from use.

It was next on my list; (un?)fortunately the trick worked, so I had no 
chance to test it :)

-- 
http://www.linuxsystems.it


* Re: raid1 issue after disk failure: both disks of the array are still active
  2012-09-14  7:16     ` Mikael Abrahamsson
  2012-09-14  7:45       ` Niccolò Belli
@ 2012-09-14  8:13       ` NeilBrown
  1 sibling, 0 replies; 27+ messages in thread
From: NeilBrown @ 2012-09-14  8:13 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Chris Murphy, Linux RAID

On Fri, 14 Sep 2012 09:16:20 +0200 (CEST) Mikael Abrahamsson
<swmike@swm.pp.se> wrote:

> On Thu, 13 Sep 2012, Chris Murphy wrote:
> 
> > "check" records errors, no action is taken by the md driver to correct 
> > it, although the disk firmware itself may try reallocation. So far, that 
> > appears to not be the case.
> >
> > "repair" causes the md driver to write correct data (from copy or 
> > reconstructed from parity), which should force the disk firmware to 
> > reallocate the affected LBAs from bad physical sectors to good ones.
> >
> > It seems in this case "repair" is indicated.
> 
> I was under the impression that "check" would check if all data blocks and 
> parity are correct, and record if there is a parity mismatch. This would 
> then be corrected by using "repair" at a later time.
> 
> I was also under the impression that if there was a read error on a drive 
> during "check", that read error would be corrected using parity because 
> it's obviously a hard error, not a logical error.

Both of your impressions are correct.

NeilBrown

> 
> Could you (or someone else) please confirm that my impression is wrong and 
> if there indeed is a hard read error using "check", this will not be 
> corrected? I would be interested in knowing why this decision was taken to 
> have this behaviour, as I feel that if there is a hard read error, this 
> should always be corrected using parity.
> 
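
The "check" and "repair" actions discussed in this message are requested through md's sysfs files sync_action and mismatch_cnt. A minimal sketch, assuming the array is md0 (an assumed name) and that it is run as root; the function names and the optional directory argument are illustrative only:

```shell
#!/bin/sh
# Sketch: request an md scrub and read back the mismatch counter.
# md0 is an assumed array name; pass another md sysfs directory to override.
md_scrub() {
    action=$1
    md_dir=${2:-/sys/block/md0/md}
    case $action in
        check|repair) ;;
        *) echo "usage: md_scrub check|repair [md_sysfs_dir]" >&2; return 1 ;;
    esac
    # Writing the action name to sync_action starts the pass.
    echo "$action" > "$md_dir/sync_action"
}

md_mismatches() {
    # Mismatch count reported by the most recent check or repair pass.
    cat "${1:-/sys/block/md0/md}/mismatch_cnt"
}
```

A typical sequence would be `md_scrub check`, waiting for the pass to finish (progress shows up in /proc/mdstat), then `md_mismatches`. As confirmed above, a read error hit during "check" is corrected from the redundant copy, while mismatches are only recorded until a "repair" is run.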



end of thread, other threads:[~2012-09-17  0:03 UTC | newest]

Thread overview: 27+ messages
2012-09-13 10:01 raid1 issue after disk failure: both disks of the array are still active Niccolò Belli
2012-09-13 10:34 ` Robin Hill
2012-09-13 10:46   ` Niccolò Belli
     [not found]     ` <5051BBC3.4050805@websitemanagers.com.au>
2012-09-13 11:29       ` Niccolò Belli
     [not found]     ` <CABYL=TpKD2B0vwTrHH=iFK3PcMWueEsi84ACRbBQkDXuiWG3kw@mail.gmail.com>
2012-09-13 15:32       ` Roberto Spadim
2012-09-13 15:48         ` Niccolò Belli
2012-09-13 15:53           ` Roberto Spadim
2012-09-14  7:54             ` Niccolò Belli
2012-09-13 17:02   ` Chris Murphy
2012-09-13 17:39     ` Roberto Spadim
2012-09-13 20:13       ` Chris Murphy
2012-09-14  7:16     ` Mikael Abrahamsson
2012-09-14  7:45       ` Niccolò Belli
2012-09-14 18:04         ` Chris Murphy
2012-09-14 18:27           ` Robin Hill
2012-09-14 18:53             ` Chris Murphy
2012-09-15 19:05               ` Niccolò Belli
2012-09-15 19:41                 ` Robin Hill
2012-09-15 22:06                   ` Niccolò Belli
2012-09-16 10:18                     ` Robin Hill
2012-09-16 10:42                   ` Niccolò Belli
2012-09-16 15:26                     ` Chris Murphy
2012-09-16 15:31                       ` Niccolò Belli
2012-09-16 23:35                         ` Niccolò Belli
2012-09-17  0:00                           ` Chris Murphy
2012-09-17  0:03                             ` Niccolò Belli
2012-09-14  8:13       ` NeilBrown
