Recovering from two almost simultaneously failed devices in RAID1

Linux RAID subsystem development

* Recovering from two almost simultaneously failed devices in RAID1
@ 2013-08-10 16:29 Carsten Aulbert
  2013-08-10 16:33 ` Carsten Aulbert
  0 siblings, 1 reply; 5+ messages in thread
From: Carsten Aulbert @ 2013-08-10 16:29 UTC (permalink / raw)
  To: Linux RAID

Hi there

I fear one of our mainboards did not play nicely with our SSDs in RAID1
configuration:
# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Fri Jul 27 11:58:50 2012
     Raid Level : raid1
     Array Size : 250050533 (238.47 GiB 256.05 GB)
  Used Dev Size : 250050533 (238.47 GiB 256.05 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Aug 10 14:58:30 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       1       0        0        1      removed

       1       8       33        -      faulty spare   /dev/sdc1


It seems both drives ran into problems at around the same time: sdc was
taken offline first, but then sdd also failed (see the log at the end of
the email).
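
A state like the one above can also be checked from a script. This is a
minimal Python sketch, assuming the `mdadm --detail` text layout shown in
this message; the names `parse_members`, `healthy` and `faulty` and the
embedded sample are illustrative and not part of mdadm.

```python
import re

# Device table copied from the `mdadm --detail /dev/md2` output above.
SAMPLE = """\
    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       1       0        0        1      removed

       1       8       33        -      faulty spare   /dev/sdc1
"""

# One table row: number, major, minor, slot (or "-"), free-text state,
# and an optional /dev path at the end of the line.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+|-)\s+(.*?)\s*(/dev/\S+)?\s*$")


def parse_members(detail_text):
    """Return (slot, state, device) tuples from `mdadm --detail` text."""
    members = []
    for line in detail_text.splitlines():
        m = ROW.match(line)
        if m:
            slot = None if m.group(4) == "-" else int(m.group(4))
            members.append((slot, m.group(5).strip(), m.group(6)))
    return members


def healthy(members):
    """Devices that are in sync and active."""
    return [dev for _, state, dev in members if dev and state == "active sync"]


def faulty(members):
    """Devices whose state mentions 'faulty'."""
    return [dev for _, state, dev in members if dev and "faulty" in state]


members = parse_members(SAMPLE)
print("in sync:", healthy(members))
print("faulty: ", faulty(members))
```

With only one member left in sync, any further error on that device leaves
md nothing to fall back on, which is what the log below shows.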

The filesystem on top of it (ext4) of course had no way of coping with
this problem other than going read-only.

The big questions of course are

(a) how to retrieve as much data as possible from the disks
(b) how to revive the raid system again

Any thoughts of what I should try first?

To tackle (a), I think I'll first image the disks with ddrescue, so that
I am covered against any mistake I make later on.
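
A two-pass imaging plan along those lines could be sketched as follows.
This is only an illustration in Python that builds the command lines
without running them; the helper `ddrescue_argv` and the `/mnt/rescue`
paths are hypothetical, and the short options used (`-n` to skip the slow
scraping phase, `-d` for direct disc access, `-r` for retry passes) are
assumed to match the installed GNU ddrescue.

```python
import shlex


def ddrescue_argv(source, image, mapfile, retries=0, direct=False,
                  skip_scraping=False):
    """Build (but do not run) a GNU ddrescue command line."""
    argv = ["ddrescue"]
    if skip_scraping:
        argv.append("-n")            # fast first pass, skip bad areas
    if direct:
        argv.append("-d")            # bypass the kernel cache when reading
    if retries:
        argv += ["-r", str(retries)]  # retry passes over bad sectors
    return argv + [source, image, mapfile]


# Hypothetical target paths on a separate, healthy disk.
SRC, IMG, MAP = "/dev/sdd1", "/mnt/rescue/sdd1.img", "/mnt/rescue/sdd1.map"

first_pass = ddrescue_argv(SRC, IMG, MAP, skip_scraping=True)
second_pass = ddrescue_argv(SRC, IMG, MAP, direct=True, retries=3)

for argv in (first_pass, second_pass):
    print(" ".join(shlex.quote(a) for a in argv))
```

Both passes share the same map file, so the second pass only revisits the
areas the first one could not read. Any repair attempt would then be made
on a copy of the image rather than on the SSDs themselves.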

Cheers

Carsten


Here's the start of the log:

Aug 10 14:57:30 gitmaster kernel: [10731321.352291] ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Aug 10 14:57:30 gitmaster kernel: [10731321.352350] ata3.00: failed command: WRITE FPDMA QUEUED
Aug 10 14:57:30 gitmaster kernel: [10731321.352380] ata3.00: cmd 61/02:00:47:00:00/00:00:00:00:00/40 tag 0 ncq 1024 out
Aug 10 14:57:30 gitmaster kernel: [10731321.352380]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Aug 10 14:57:30 gitmaster kernel: [10731321.352469] ata3.00: status: { DRDY }
Aug 10 14:57:30 gitmaster kernel: [10731321.352495] ata3: hard resetting link
Aug 10 14:57:30 gitmaster kernel: [10731321.352528] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Aug 10 14:57:30 gitmaster kernel: [10731321.352574] ata4.00: failed command: WRITE FPDMA QUEUED
Aug 10 14:57:30 gitmaster kernel: [10731321.352604] ata4.00: cmd 61/02:00:47:00:00/00:00:00:00:00/40 tag 0 ncq 1024 out
Aug 10 14:57:30 gitmaster kernel: [10731321.352605]          res 40/00:00:47:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Aug 10 14:57:30 gitmaster kernel: [10731321.352695] ata4.00: status: { DRDY }
Aug 10 14:57:30 gitmaster kernel: [10731321.352721] ata4: hard resetting link
Aug 10 14:57:35 gitmaster kernel: [10731326.709171] ata3: link is slow to respond, please be patient (ready=0)
Aug 10 14:57:35 gitmaster kernel: [10731326.721137] ata4: link is slow to respond, please be patient (ready=0)
Aug 10 14:57:40 gitmaster kernel: [10731331.354487] ata3: COMRESET failed (errno=-16)
Aug 10 14:57:40 gitmaster kernel: [10731331.354518] ata3: hard resetting link
Aug 10 14:57:40 gitmaster kernel: [10731331.370448] ata4: COMRESET failed (errno=-16)
Aug 10 14:57:40 gitmaster kernel: [10731331.370480] ata4: hard resetting link
Aug 10 14:57:45 gitmaster kernel: [10731336.715383] ata3: link is slow to respond, please be patient (ready=0)
Aug 10 14:57:45 gitmaster kernel: [10731336.735346] ata4: link is slow to respond, please be patient (ready=0)
Aug 10 14:57:50 gitmaster kernel: [10731341.360692] ata3: COMRESET failed (errno=-16)
Aug 10 14:57:50 gitmaster kernel: [10731341.360723] ata3: hard resetting link
Aug 10 14:57:50 gitmaster kernel: [10731341.388654] ata4: COMRESET failed (errno=-16)
Aug 10 14:57:50 gitmaster kernel: [10731341.388686] ata4: hard resetting link
Aug 10 14:57:55 gitmaster kernel: [10731346.721587] ata3: link is slow to respond, please be patient (ready=0)
Aug 10 14:57:55 gitmaster kernel: [10731346.749571] ata4: link is slow to respond, please be patient (ready=0)
Aug 10 14:58:01 gitmaster /USR/SBIN/CRON[10885]: (root) CMD (cd /srv/gitorious && rake ultrasphinx:index RAILS_ENV=production > /dev/null 2>&1)
Aug 10 14:58:25 gitmaster kernel: [10731376.344429] ata3: COMRESET failed (errno=-16)
Aug 10 14:58:25 gitmaster kernel: [10731376.344464] ata3: limiting SATA link speed to 1.5 Gbps
Aug 10 14:58:25 gitmaster kernel: [10731376.344497] ata3: hard resetting link
Aug 10 14:58:25 gitmaster kernel: [10731376.424371] ata4: COMRESET failed (errno=-16)
Aug 10 14:58:25 gitmaster kernel: [10731376.424403] ata4: limiting SATA link speed to 1.5 Gbps
Aug 10 14:58:25 gitmaster kernel: [10731376.424436] ata4: hard resetting link
Aug 10 14:58:30 gitmaster kernel: [10731381.365521] ata3: COMRESET failed (errno=-16)
Aug 10 14:58:30 gitmaster kernel: [10731381.365554] ata3: reset failed, giving up
Aug 10 14:58:30 gitmaster kernel: [10731381.365585] ata3.00: disabled
Aug 10 14:58:30 gitmaster kernel: [10731381.365610] ata3.00: device reported invalid CHS sector 0
Aug 10 14:58:30 gitmaster kernel: [10731381.365643] ata3: EH complete
Aug 10 14:58:30 gitmaster kernel: [10731381.365675] sd 2:0:0:0: [sdc] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.365701] sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.365748] sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 00 00 00 47 00 00 02 00
Aug 10 14:58:30 gitmaster kernel: [10731381.365816] end_request: I/O error, dev sdc, sector 71
Aug 10 14:58:30 gitmaster kernel: [10731381.365844] end_request: I/O error, dev sdc, sector 71
Aug 10 14:58:30 gitmaster kernel: [10731381.365871] md: super_written gets error=-5, uptodate=0
Aug 10 14:58:30 gitmaster kernel: [10731381.365900] md/raid1:md2: Disk failure on sdc1, disabling device.
Aug 10 14:58:30 gitmaster kernel: [10731381.365900] md/raid1:md2: Operation continuing on 1 devices.
Aug 10 14:58:30 gitmaster kernel: [10731381.453474] ata4: COMRESET failed (errno=-16)
Aug 10 14:58:30 gitmaster kernel: [10731381.453505] ata4: reset failed, giving up
Aug 10 14:58:30 gitmaster kernel: [10731381.453536] ata4.00: disabled
Aug 10 14:58:30 gitmaster kernel: [10731381.453565] ata4: EH complete
Aug 10 14:58:30 gitmaster kernel: [10731381.453596] sd 3:0:0:0: [sdd] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.453621] sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.453669] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 00 00 47 00 00 02 00
Aug 10 14:58:30 gitmaster kernel: [10731381.453737] end_request: I/O error, dev sdd, sector 71
Aug 10 14:58:30 gitmaster kernel: [10731381.453765] end_request: I/O error, dev sdd, sector 71
Aug 10 14:58:30 gitmaster kernel: [10731381.453792] md: super_written gets error=-5, uptodate=0
Aug 10 14:58:30 gitmaster kernel: [10731381.453867] sd 3:0:0:0: [sdd] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.453894] sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.453941] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 00 00 47 00 00 02 00
Aug 10 14:58:30 gitmaster kernel: [10731381.454010] end_request: I/O error, dev sdd, sector 71
Aug 10 14:58:30 gitmaster kernel: [10731381.454036] end_request: I/O error, dev sdd, sector 71
Aug 10 14:58:30 gitmaster kernel: [10731381.454064] md: super_written gets error=-5, uptodate=0
Aug 10 14:58:30 gitmaster kernel: [10731381.454136] RAID1 conf printout:
Aug 10 14:58:30 gitmaster kernel: [10731381.454140]  --- wd:1 rd:2
Aug 10 14:58:30 gitmaster kernel: [10731381.454143]  disk 0, wo:0, o:1, dev:sdd1
Aug 10 14:58:30 gitmaster kernel: [10731381.454146]  disk 1, wo:1, o:0, dev:sdc1
Aug 10 14:58:30 gitmaster kernel: [10731381.477438] RAID1 conf printout:
Aug 10 14:58:30 gitmaster kernel: [10731381.477442]  --- wd:1 rd:2
Aug 10 14:58:30 gitmaster kernel: [10731381.477446]  disk 0, wo:0, o:1, dev:sdd1
Aug 10 14:58:30 gitmaster kernel: [10731381.477477] sd 3:0:0:0: [sdd] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.477514] sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.477562] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 0e c7 da 6f 00 00 18 00
Aug 10 14:58:30 gitmaster kernel: [10731381.477630] end_request: I/O error, dev sdd, sector 247978607
Aug 10 14:58:30 gitmaster kernel: [10731381.477728] Aborting journal on device md2-8.
Aug 10 14:58:30 gitmaster kernel: [10731381.477774] sd 3:0:0:0: [sdd] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.477802] sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.477851] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 0e c4 08 3f 00 00 08 00
Aug 10 14:58:30 gitmaster kernel: [10731381.477922] end_request: I/O error, dev sdd, sector 247728191
Aug 10 14:58:30 gitmaster kernel: [10731381.477944] sd 3:0:0:0: [sdd] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.477945] sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.477947] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 00 08 3f 00 00 08 00
Aug 10 14:58:30 gitmaster kernel: [10731381.477950] end_request: I/O error, dev sdd, sector 2111
Aug 10 14:58:30 gitmaster kernel: [10731381.477982] Buffer I/O error on device md2, logical block 0
Aug 10 14:58:30 gitmaster kernel: [10731381.477983] lost page write due to I/O error on md2
Aug 10 14:58:30 gitmaster kernel: [10731381.478011] EXT4-fs error (device md2): ext4_journal_start_sb:327: Detected aborted journal
Aug 10 14:58:30 gitmaster kernel: [10731381.478013] EXT4-fs (md2): Remounting filesystem read-only
Aug 10 14:58:30 gitmaster kernel: [10731381.478014] EXT4-fs (md2): previous I/O error to superblock detected
Aug 10 14:58:30 gitmaster kernel: [10731381.478052] sd 3:0:0:0: [sdd] Unhandled error code
Aug 10 14:58:30 gitmaster kernel: [10731381.478054] sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 10 14:58:30 gitmaster kernel: [10731381.478055] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 00 00 08 3f 00 00 08 00
Aug 10 14:58:30 gitmaster kernel: [10731381.478059] end_request: I/O error, dev sdd, sector 2111
Aug 10 14:58:30 gitmaster kernel: [10731381.478078] Buffer I/O error on device md2, logical block 0
Aug 10 14:58:30 gitmaster kernel: [10731381.478079] lost page write due to I/O error on md2
Aug 10 14:58:30 gitmaster kernel: [10731381.485182] Buffer I/O error on device md2, logical block 30965760
Aug 10 14:58:30 gitmaster kernel: [10731381.485184] lost page write due to I/O error on md2
Aug 10 14:58:30 gitmaster kernel: [10731381.485190] JBD2: I/O error detected when updating journal superblock for md2-8.
Aug 10 14:58:30 gitmaster mdadm[1470]: Fail event detected on md device /dev/md/2, component device /dev/sdc1
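
The "CDB: Write(10)" lines in the log carry the target sector themselves,
which helps when matching them to the "end_request ... sector N" lines.
The following is a small Python sketch of that decoding, assuming the
standard 10-byte WRITE(10) layout (opcode 0x2a, 4-byte big-endian LBA,
2-byte transfer length); the function name `decode_write10` is made up
for this example.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks to be written.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB: %r" % cdb_hex)
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8
    return lba, blocks


# CDBs copied from the log above.
for cdb in ("2a 00 00 00 00 47 00 00 02 00",
            "2a 00 0e c7 da 6f 00 00 18 00",
            "2a 00 0e c4 08 3f 00 00 08 00",
            "2a 00 00 00 08 3f 00 00 08 00"):
    lba, blocks = decode_write10(cdb)
    print("%s -> sector %d, %d blocks" % (cdb, lba, blocks))
```

The decoded addresses are 71, 247978607, 247728191 and 2111, the same
sectors the kernel reports. The write to sector 71 fails on both sdc and
sdd and is followed each time by "md: super_written gets error=-5", so it
looks like the md superblock update rather than filesystem data.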



-- 
Dr. Carsten Aulbert - Max Planck Institute for Gravitational Physics
Callinstrasse 38, 30167 Hannover, Germany
phone/fax: +49 511 762-17185 / -17193
https://wiki.atlas.aei.uni-hannover.de/foswiki/bin/view/ATLAS/WebHome



* Re: Recovering from two almost simultaneously failed devices in RAID1
  2013-08-10 16:29 Recovering from two almost simultaneously failed devices in RAID1 Carsten Aulbert
@ 2013-08-10 16:33 ` Carsten Aulbert
  2013-08-10 17:39   ` Carsten Aulbert
  0 siblings, 1 reply; 5+ messages in thread
From: Carsten Aulbert @ 2013-08-10 16:33 UTC (permalink / raw)
  To: Linux RAID

Addendum:

Currently I cannot access either of the two devices, so I fear I need to
reboot the machine, possibly with a power cycle, to regain access to them.

Cheers

Carsten



* Re: Recovering from two almost simultaneously failed devices in RAID1
  2013-08-10 16:33 ` Carsten Aulbert
@ 2013-08-10 17:39   ` Carsten Aulbert
  2013-08-10 17:45     ` Mathias Burén
  0 siblings, 1 reply; 5+ messages in thread
From: Carsten Aulbert @ 2013-08-10 17:39 UTC (permalink / raw)
  To: Linux RAID

*sigh*

Since I am at home and only have IPMI access with relatively high
latencies, I did not manage to boot into single-user mode and the system
booted up normally. Luckily for me, both md and ext4 tell me the file
system is safe and OK.


# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Fri Jul 27 11:58:50 2012
     Raid Level : raid1
     Array Size : 250050533 (238.47 GiB 256.05 GB)
  Used Dev Size : 250050533 (238.47 GiB 256.05 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Aug 10 19:38:01 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : gitmaster:2  (local to host gitmaster)
           UUID : 7cb262a7:54496605:238cc0bf:6fa6a3e9
         Events : 61

    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       1       8       33        1      active sync   /dev/sdc1


I think I got really lucky (still running checks on the git repositories
on this md).
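
Besides checking the repositories, the two mirror halves can be compared
by md itself. The sketch below, in Python, uses the md sysfs files
`sync_action` and `mismatch_cnt`; the helper names and the `sysfs_root`
parameter are assumptions made for this example (the parameter exists
only so the code can be tried against a scratch directory), and writing
to the real files needs root.

```python
from pathlib import Path


def _md_dir(md_name, sysfs_root):
    """Directory holding the md attributes, e.g. /sys/block/md2/md."""
    return Path(sysfs_root) / md_name / "md"


def request_check(md_name, sysfs_root="/sys/block"):
    """Ask md to read all members and count differences (no repair)."""
    (_md_dir(md_name, sysfs_root) / "sync_action").write_text("check\n")


def read_status(md_name, sysfs_root="/sys/block"):
    """Return (current sync action, mismatch count) for the array."""
    md = _md_dir(md_name, sysfs_root)
    action = (md / "sync_action").read_text().strip()
    mismatches = int((md / "mismatch_cnt").read_text().strip())
    return action, mismatches


# Typical use on the affected host (as root):
#   request_check("md2")
#   ... wait until read_status("md2")[0] is "idle" again ...
#   print(read_status("md2"))
```

A non-zero mismatch count after the check would mean the two members
differ somewhere, which is plausible here since both were dropped in the
middle of writes.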

Has anyone else experienced such a problem with SSDs (or HDDs) being
disconnected like this?

Cheers

Carsten



* Re: Recovering from two almost simultaneously failed devices in RAID1
  2013-08-10 17:39   ` Carsten Aulbert
@ 2013-08-10 17:45     ` Mathias Burén
  2013-08-10 18:05       ` Carsten Aulbert
  0 siblings, 1 reply; 5+ messages in thread
From: Mathias Burén @ 2013-08-10 17:45 UTC (permalink / raw)
  To: Carsten Aulbert; +Cc: Linux RAID

What's the smartctl -a output for both drives? You might have bad SATA
cables. A long self-test (smartctl -t long) might also be worth running;
smartctl -c reports how long the drive expects it to take. I'm not sure
how useful that is on SSDs, though.

Mathias

* Re: Recovering from two almost simultaneously failed devices in RAID1
  2013-08-10 17:45     ` Mathias Burén
@ 2013-08-10 18:05       ` Carsten Aulbert
  0 siblings, 0 replies; 5+ messages in thread
From: Carsten Aulbert @ 2013-08-10 18:05 UTC (permalink / raw)
  To: Mathias Burén; +Cc: Linux RAID

Hi

On 08/10/2013 07:45 PM, Mathias Burén wrote:
> smartctl -a 

Looks pretty much innocent:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   100   100   050    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0013   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       9098
 10 Spin_Retry_Count        0x0013   100   100   050    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       13
167 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       2
169 Unknown_Attribute       0x0013   092   092   010    Pre-fail  Always       -       0
173 Unknown_Attribute       0x0012   169   169   000    Old_age   Always       -       0
175 Program_Fail_Count_Chip 0x0013   100   100   010    Pre-fail  Always       -       0
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0023   073   073   030    Pre-fail  Always       -       27 (Lifetime Min/Max 26/40)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0013   100   100   050    Pre-fail  Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      9098         -


and

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   100   100   050    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0013   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       9098
 10 Spin_Retry_Count        0x0013   100   100   050    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       12
167 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       2
169 Unknown_Attribute       0x0013   095   095   010    Pre-fail  Always       -       0
173 Unknown_Attribute       0x0012   169   169   000    Old_age   Always       -       0
175 Program_Fail_Count_Chip 0x0013   100   100   010    Pre-fail  Always       -       0
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0023   070   070   030    Pre-fail  Always       -       30 (Lifetime Min/Max 29/40)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0013   100   100   050    Pre-fail  Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      9098         -
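
To see at a glance where the two reports differ, the attribute tables can
be compared mechanically. This is a rough Python sketch, assuming each
attribute sits on a single line (that is, with the mail client's line
wrapping undone); `parse_attributes` and `differing` are names invented
for this example, and only a few rows from the two reports are embedded.

```python
FIRST = """\
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       9098
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       13
169 Unknown_Attribute       0x0013   092   092   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0023   073   073   030    Pre-fail  Always       -       27 (Lifetime Min/Max 26/40)
"""

SECOND = """\
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       9098
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       12
169 Unknown_Attribute       0x0013   095   095   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0023   070   070   030    Pre-fail  Always       -       30 (Lifetime Min/Max 29/40)
"""


def parse_attributes(text):
    """Map attribute ID -> dict of name, value, worst, thresh, raw."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split(None, 9)  # raw value may contain spaces
        if len(fields) == 10 and fields[0].isdigit():
            attrs[int(fields[0])] = {
                "name": fields[1],
                "value": int(fields[3]),
                "worst": int(fields[4]),
                "thresh": int(fields[5]),
                "raw": fields[9].strip(),
            }
    return attrs


def differing(a, b):
    """Sorted IDs present in both reports whose entries are not equal."""
    return sorted(i for i in a.keys() & b.keys() if a[i] != b[i])


first, second = parse_attributes(FIRST), parse_attributes(SECOND)
for attr_id in differing(first, second):
    print(attr_id, first[attr_id]["name"],
          first[attr_id]["raw"], "vs", second[attr_id]["raw"])
```

On the full tables the only differences are the power cycle count (13
versus 12), the normalized value of attribute 169 (92 versus 95) and the
temperature, none of which points at a failing drive.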


I think I need to check the BIOS settings and reseat all internal
connections on Monday, when I'm back in the office. I will also ask
Supermicro whether they know more about this (and perhaps have a new BIOS).

More suggestions?

Cheers

carsten

