linux-raid.vger.kernel.org archive mirror
* Problem with 5disk RAID5 array - two drives lost
From: Tim Bostrom @ 2006-04-19 19:39 UTC (permalink / raw)
  To: linux-raid

Good day,

I'm running FC4 (kernel 2.6.11-1.1369) with a 5-disk RAID5 array.
This past weekend, after I rebooted the machine, /dev/md0 would no
longer mount; Fedora aborts booting and forces me to fix the
filesystem.  Upon further investigation, it looks like I lost two
drives within a few weeks of each other.  I'll go ahead and get this
out of the way - I'm an idiot and didn't set up mdadm -F to mail me
about RAID problems.
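
For illustration, the kind of check that monitoring would have done
after the February failure can be sketched as below.  This is only a
sketch, not how mdadm -F is implemented: it assumes the usual
/proc/mdstat layout, where an array's status line carries a
"[total/active]" pair.  The SAMPLE text is hypothetical (the block
count is a made-up placeholder), and degraded_arrays is a name
invented for this example.

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array: (total, active)} for arrays running with missing members."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # A line such as "md0 : active raid5 ..." starts a new array entry.
        name = re.match(r"^(md\d+)\s*:", line)
        if name:
            current = name.group(1)
            continue
        # The following status line carries "[total/active]", e.g. "[5/4]".
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if status and current:
            total, active = int(status.group(1)), int(status.group(2))
            if active < total:
                degraded[current] = (total, active)
    return degraded

# Hypothetical /proc/mdstat contents for a 5-disk array with one member gone.
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 hdc1[4] hdg1[2] hdf1[1] hde1[0]
      1000000 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUU_U]
unused devices: <none>
"""

for array, (total, active) in degraded_arrays(SAMPLE).items():
    print(f"{array}: only {active} of {total} members active")
```

In practice one would read /proc/mdstat itself and send mail when the
result is non-empty, which is roughly the job mdadm -F is meant for.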

It appears that /dev/hdf1 failed this past week and /dev/hdh1 failed
back in February.  I tried an mdadm --assemble --force and was able to
get the following:

==========================
mdadm: forcing event count in /dev/hdf1(1) from 777532 upto 777535
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/hdf1
raid5: raid level 5 set md0 active with 4 out of 5 devices, algorithm 2
mdadm: /dev/md0 has been started with 4 drives (out of 5).
==========================


I then tried to mount /dev/md0 and received the following:
====================
raid5: Disk failure on hdf1, disabling device.  Operation continuing  
on  drives
mount: wrong fs type, bad option, bad superblock on /dev/md0,
missing codepage or other error
In some cases useful info is found in syslog - try dmesg | tail
=====================

In checking dmesg, I find:
==================================
raid5: device hde1 operational as raid disk 0
raid5: device hdc1 operational as raid disk 4
raid5: device hdg1 operational as raid disk 2
raid5: device hdf1 operational as raid disk 1
raid5: allocated 5254kB for md0
raid5: raid level 5 set md0 active with 4 out of 5 devices, algorithm 2
RAID5 conf printout:
--- rd:5 wd:4 fd:1
disk 0, o:1, dev:hde1
disk 1, o:1, dev:hdf1
disk 2, o:1, dev:hdg1
disk 4, o:1, dev:hdc1
usb 1-2: USB disconnect, address 2
usb 1-2: new full speed USB device using uhci_hcd and address 3
usb 1-2: not running at top speed; connect to a high speed hub
scsi1 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 3
usb-storage: waiting for device to settle before scanning
   Vendor: SanDisk   Model: Cruzer Mini       Rev: 0.1
   Type:   Direct-Access                      ANSI SCSI revision: 02
SCSI device sda: 1000944 512-byte hdwr sectors (512 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
SCSI device sda: 1000944 512-byte hdwr sectors (512 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
sda: sda1
Attached scsi removable disk sda at scsi1, channel 0, id 0, lun 0
usb-storage: device scan complete
spurious 8259A interrupt: IRQ7.
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6720,  
high=0, low=6720, sector=6719
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6719
raid5: Disk failure on hdf1, disabling device. Operation continuing  
on 3 devices
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6731,  
high=0, low=6731, sector=6727
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6727
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6735,  
high=0, low=6735, sector=6735
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6735
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6743,  
high=0, low=6743, sector=6743
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6743
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6753,  
high=0, low=6753, sector=6751
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6751
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6763,  
high=0, low=6763, sector=6759
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6759
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6770,  
high=0, low=6770, sector=6767
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6767
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6776,  
high=0, low=6776, sector=6775
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6775
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6803,  
high=0, low=6803, sector=6783
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6783
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6803,  
high=0, low=6803, sector=6791
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6791
JBD: Failed to read block at offset 1794
JBD: recovery failed
EXT3-fs: error loading journal.
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6803,  
high=0, low=6803, sector=6799
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6799
Buffer I/O error on device md0, logical block 1604
lost page write due to I/O error on md0
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6807,  
high=0, low=6807, sector=6807
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6807
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6815,  
high=0, low=6815, sector=6815
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6815
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6823,  
high=0, low=6823, sector=6823
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6823
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6831,  
high=0, low=6831, sector=6831
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6831
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6841,  
high=0, low=6841, sector=6839
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6839
hdf: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdf: dma_intr: error=0x40 { UncorrectableError }, LBAsect=6851,  
high=0, low=6851, sector=6847
ide: failed opcode was: unknown
end_request: I/O error, dev hdf, sector 6847
RAID5 conf printout:
--- rd:5 wd:3 fd:2
disk 0, o:1, dev:hde1
disk 1, o:0, dev:hdf1
disk 2, o:1, dev:hdg1
disk 4, o:1, dev:hdc1
RAID5 conf printout:
--- rd:5 wd:3 fd:2
disk 0, o:1, dev:hde1
disk 2, o:1, dev:hdg1
disk 4, o:1, dev:hdc1
================================
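
The read errors in a log like the one above can be summarised with a
short script.  What follows is a minimal sketch that assumes the
"end_request: I/O error, dev X, sector N" line format seen in this
dmesg output; failed_sectors is a name invented for the example, and
EXCERPT holds only the first three error lines from the log.

```python
import re
from collections import defaultdict

def failed_sectors(dmesg_text):
    """Return {device: sorted unique sector numbers} for end_request I/O errors."""
    pattern = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
    errors = defaultdict(list)
    for match in pattern.finditer(dmesg_text):
        errors[match.group(1)].append(int(match.group(2)))
    return {dev: sorted(set(sectors)) for dev, sectors in errors.items()}

# First three error lines copied from the dmesg output above.
EXCERPT = """\
end_request: I/O error, dev hdf, sector 6719
end_request: I/O error, dev hdf, sector 6727
end_request: I/O error, dev hdf, sector 6735
"""

for dev, sectors in failed_sectors(EXCERPT).items():
    print(f"{dev}: {len(sectors)} failed reads, sectors {sectors[0]}..{sectors[-1]}")
```

Fed the whole log, this should report 17 failed reads on hdf, all
between sectors 6719 and 6847, which suggests one small damaged region
near the start of the disk rather than errors scattered across it.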

I'm guessing /dev/hdf is shot.  I haven't tried an fsck yet, though.
Would that be advisable?  I don't want to bork all the data; there's
about 700 GB of it.  I'm willing to lose any data that was added
since the February drive failure.  Is there a way to assemble the
array with /dev/hdh instead of /dev/hdf, accepting possible
corruption in files that were added since February?

Any advice would be great.  I'm at a loss and I don't want to lose all
of the data if I don't have to.  I might end up visiting one of those
data recovery shops if I can't fix this on my own.

Thank you,

Tim



mdadm -E outputs below:
=================================

/dev/hdc1:
           Magic : a92b4efc
         Version : 00.90.01
            UUID : 2d1d58c2:23357cca:12b8e65a:a80cdebe
   Creation Time : Tue Jul 26 17:20:10 2005
      Raid Level : raid5
    Raid Devices : 5
   Total Devices : 4
Preferred Minor : 0

     Update Time : Sun Apr 16 09:10:28 2006
           State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 3
   Spare Devices : 0
        Checksum : 4a150769 - correct
          Events : 0.777535

          Layout : left-symmetric
      Chunk Size : 128K

       Number   Major   Minor   RaidDevice State
this     4      22        1        4      active sync   /dev/hdc1

    0     0      33        1        0      active sync   /dev/hde1
    1     1       0        0        1      faulty removed
    2     2      34        1        2      active sync   /dev/hdg1
    3     3       0        0        3      faulty removed
    4     4      22        1        4      active sync   /dev/hdc1



/dev/hde1:
           Magic : a92b4efc
         Version : 00.90.01
            UUID : 2d1d58c2:23357cca:12b8e65a:a80cdebe
   Creation Time : Tue Jul 26 17:20:10 2005
      Raid Level : raid5
    Raid Devices : 5
   Total Devices : 4
Preferred Minor : 0

     Update Time : Sun Apr 16 09:10:28 2006
           State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 3
   Spare Devices : 0
        Checksum : 4a15076c - correct
          Events : 0.777535

          Layout : left-symmetric
      Chunk Size : 128K

       Number   Major   Minor   RaidDevice State
this     0      33        1        0      active sync   /dev/hde1

    0     0      33        1        0      active sync   /dev/hde1
    1     1       0        0        1      faulty removed
    2     2      34        1        2      active sync   /dev/hdg1
    3     3       0        0        3      faulty removed
    4     4      22        1        4      active sync   /dev/hdc1




/dev/hdf1:
           Magic : a92b4efc
         Version : 00.90.01
            UUID : 2d1d58c2:23357cca:12b8e65a:a80cdebe
   Creation Time : Tue Jul 26 17:20:10 2005
      Raid Level : raid5
    Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

     Update Time : Fri Apr 14 13:46:06 2006
           State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 2
   Spare Devices : 0
        Checksum : 4a06c868 - correct
          Events : 0.777532

          Layout : left-symmetric
      Chunk Size : 128K

       Number   Major   Minor   RaidDevice State
this     1      33       65        1      active sync   /dev/hdf1

    0     0      33        1        0      active sync   /dev/hde1
    1     1      33       65        1      active sync   /dev/hdf1
    2     2      34        1        2      active sync   /dev/hdg1
    3     3       0        0        3      faulty removed
    4     4      22        1        4      active sync   /dev/hdc1

/dev/hdh1:
           Magic : a92b4efc
         Version : 00.90.01
            UUID : 2d1d58c2:23357cca:12b8e65a:a80cdebe
   Creation Time : Tue Jul 26 17:20:10 2005
      Raid Level : raid5
    Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

     Update Time : Tue Feb 21 07:47:51 2006
           State : active
Active Devices : 5
Working Devices : 5
Failed Devices : 0
   Spare Devices : 0
        Checksum : 49c0be2c - correct
          Events : 0.698097

          Layout : left-symmetric
      Chunk Size : 128K

       Number   Major   Minor   RaidDevice State
this     3      34       65        3      active sync   /dev/hdh1

    0     0      33        1        0      active sync   /dev/hde1
    1     1      33       65        1      active sync   /dev/hdf1
    2     2      34        1        2      active sync   /dev/hdg1
    3     3      34       65        3      active sync   /dev/hdh1
    4     4      22        1        4      active sync   /dev/hdc1



/dev/hdg1:
           Magic : a92b4efc
         Version : 00.90.01
            UUID : 2d1d58c2:23357cca:12b8e65a:a80cdebe
   Creation Time : Tue Jul 26 17:20:10 2005
      Raid Level : raid5
    Raid Devices : 5
   Total Devices : 4
Preferred Minor : 0

     Update Time : Sun Apr 16 09:10:28 2006
           State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 3
   Spare Devices : 0
        Checksum : 4a150771 - correct
          Events : 0.777535

          Layout : left-symmetric
      Chunk Size : 128K

       Number   Major   Minor   RaidDevice State
this     2      34        1        2      active sync   /dev/hdg1

    0     0      33        1        0      active sync   /dev/hde1
    1     1       0        0        1      faulty removed
    2     2      34        1        2      active sync   /dev/hdg1
    3     3       0        0        3      faulty removed
    4     4      22        1        4      active sync   /dev/hdc1
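
To see how far each member lags behind, the Events fields in the
mdadm -E output above can be compared.  The sketch below assumes the
"Events : 0.777535" value is a high.low pair of 32-bit halves, as
older mdadm versions appear to print for 0.90 superblocks;
parse_examine is a name invented for this example, and SAMPLE is
trimmed to the device headers and Events lines of three of the disks.

```python
import re

def parse_examine(text):
    """Map each '/dev/...:' section header to its integer event count."""
    events = {}
    device = None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line.strip())
        if header:
            device = header.group(1)
            continue
        field = re.match(r"^\s*Events\s*:\s*(\d+)\.(\d+)", line)
        if field and device:
            # Assumption: "hi.lo" are the upper and lower 32 bits of the counter.
            events[device] = (int(field.group(1)) << 32) | int(field.group(2))
    return events

# Event counts copied from the mdadm -E output above (other fields omitted).
SAMPLE = """\
/dev/hdc1:
          Events : 0.777535
/dev/hdf1:
          Events : 0.777532
/dev/hdh1:
          Events : 0.698097
"""

events = parse_examine(SAMPLE)
newest = max(events.values())
for dev, count in sorted(events.items(), key=lambda item: -item[1]):
    print(f"{dev}: events={count}, behind newest by {newest - count}")
```

With these values hdf1 is only 3 events behind the current members,
while hdh1 is 79438 events behind, consistent with it having dropped
out back in February.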




