linux-raid.vger.kernel.org archive mirror
From: Bernhard Dobbels <Bernhard@Dobbels.com>
To: linux-raid@vger.kernel.org
Subject: raid5+ lvm2 disaster
Date: Fri, 09 Jul 2004 22:16:56 +0200
Message-ID: <40EEFD38.8080805@dobbels.com>

Hi,

Short history: I configured RAID 5 + LVM2 for my data disks and
everything worked fine. While converting the root (system) disk to
RAID 1, I lost my system disks. I did a reinstall of Debian (the first
in 4 years) and compiled a new 2.6.6 kernel.

Now I am trying to recover my RAID 5 + LVM. When the RAID 5 was up (in
degraded mode) I could see all my LVs, so I think all the data is
still OK.

I had problems with DMA timeouts; the patch mentioned in
http://kerneltrap.org/node/view/3040 for the PDC20268 deals with the
same errors that show up in my messages log.
I checked the RAID with lsraid and two disks seemed OK, although one
was listed as a spare. I then ran mkraid --really-force /dev/md0 to
recreate the array, but after this I cannot start it anymore.
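
In case it helps, I assume the non-destructive way to double-check
what each disk thinks about the array would be mdadm's examine mode,
roughly like this (untested guess on my side, if mdadm is available
here):

mdadm --examine /dev/hdc1
mdadm --examine /dev/hde1
mdadm --examine /dev/hdg1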

Any help or tips to recover all or part of the data would be welcome
(of course there is no backup ;-), as the data was not that
important), but the wife still wants to watch an episode of Friends a
day, which she can't do now ;(.

Most of the commands and their output:

tail /var/log/messages:

Jul  9 14:00:43 localhost kernel: hde: dma_timer_expiry: dma status == 0x61
Jul  9 14:00:53 localhost kernel: hde: DMA timeout error
Jul  9 14:00:53 localhost kernel: hde: dma timeout error: status=0x51 { 
DriveReady SeekComplete Error }
Jul  9 14:00:53 localhost kernel: hde: dma timeout error: error=0x40 { 
UncorrectableError }, LBAsect=118747579, high=7, low=1307067, 
sector=118747455
Jul  9 14:00:53 localhost kernel: end_request: I/O error, dev hde, 
sector 118747455
Jul  9 14:00:53 localhost kernel: md: md0: sync done.
Jul  9 14:00:53 localhost kernel: RAID5 conf printout:
Jul  9 14:00:53 localhost kernel:  --- rd:3 wd:1 fd:2
Jul  9 14:00:53 localhost kernel:  disk 0, o:1, dev:hdc1
Jul  9 14:00:53 localhost kernel:  disk 1, o:0, dev:hde1
Jul  9 14:00:53 localhost kernel:  disk 2, o:1, dev:hdg1
Jul  9 14:00:53 localhost kernel: RAID5 conf printout:
Jul  9 14:00:53 localhost kernel:  --- rd:3 wd:1 fd:2
Jul  9 14:00:53 localhost kernel:  disk 0, o:1, dev:hdc1
Jul  9 14:00:53 localhost kernel:  disk 2, o:1, dev:hdg1
Jul  9 14:00:53 localhost kernel: md: syncing RAID array md0
Jul  9 14:00:53 localhost kernel: md: minimum _guaranteed_ 
reconstruction speed: 1000 KB/sec/disc.
Jul  9 14:00:53 localhost kernel: md: using maximum available idle IO 
bandwith (but not more than 200000 KB/sec) for reconstruction.
Jul  9 14:00:53 localhost kernel: md: using 128k window, over a total of 
195358336 blocks.
Jul  9 14:00:53 localhost kernel: md: md0: sync done.
Jul  9 14:00:53 localhost kernel: md: syncing RAID array md0
Jul  9 14:00:53 localhost kernel: md: minimum _guaranteed_ 
reconstruction speed: 1000 KB/sec/disc.
Jul  9 14:00:53 localhost kernel: md: using maximum available idle IO 
bandwith (but not more than 200000 KB/sec) for reconstruction.
Jul  9 14:00:53 localhost kernel: md: using 128k window, over a total of 
195358336 blocks.
Jul  9 14:00:53 localhost kernel: md: md0: sync done.

+ many times (per second) the same repeated.



viking:/home/bernhard# lsraid -a /dev/md0 -d /dev/hdc1 -d /dev/hde1 -d 
/dev/hdg1
[dev   9,   0] /dev/md0         829542B9.3737417C.D102FD21.18FFE273 offline
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev  34,   1] /dev/hdg1        829542B9.3737417C.D102FD21.18FFE273 good
[dev  33,   1] /dev/hde1        829542B9.3737417C.D102FD21.18FFE273 failed
[dev  22,   1] /dev/hdc1        829542B9.3737417C.D102FD21.18FFE273 spare


viking:/home/bernhard# lsraid -a /dev/md0 -d /dev/hdc1 -d /dev/hde1 -d 
/dev/hdg1 -D
[dev 22, 1] /dev/hdc1:
         md device       = [dev 9, 0] /dev/md0
         md uuid         = 829542B9.3737417C.D102FD21.18FFE273
         state           = spare

[dev 34, 1] /dev/hdg1:
         md device       = [dev 9, 0] /dev/md0
         md uuid         = 829542B9.3737417C.D102FD21.18FFE273
         state           = good

[dev 33, 1] /dev/hde1:
         md device       = [dev 9, 0] /dev/md0
         md uuid         = 829542B9.3737417C.D102FD21.18FFE273
         state           = failed

viking:/home/bernhard# lsraid -R -a /dev/md0 -d /dev/hdc1 -d /dev/hde1 
-d /dev/hdg1
# This raidtab was generated by lsraid version 0.7.0.
# It was created from a query on the following devices:
#       /dev/md0
#       /dev/hdc1
#       /dev/hde1
#       /dev/hdg1

# md device [dev 9, 0] /dev/md0 queried offline
# Authoritative device is [dev 22, 1] /dev/hdc1
raiddev /dev/md0
         raid-level              5
         nr-raid-disks           3
         nr-spare-disks          1
         persistent-superblock   1
         chunk-size              32

         device          /dev/hdg1
         raid-disk               2
         device          /dev/hdc1
         spare-disk              0
         device          /dev/null
         failed-disk             0
         device          /dev/null
         failed-disk             1




viking:/home/bernhard# lsraid -R -p
# This raidtab was generated by lsraid version 0.7.0.
# It was created from a query on the following devices:
#       /dev/hda
#       /dev/hda1
#       /dev/hda2
#       /dev/hda5
#       /dev/hdb
#       /dev/hdb1
#       /dev/hdc
#       /dev/hdc1
#       /dev/hdd
#       /dev/hdd1
#       /dev/hde
#       /dev/hde1
#       /dev/hdf
#       /dev/hdf1
#       /dev/hdg
#       /dev/hdg1
#       /dev/hdh
#       /dev/hdh1

# md device [dev 9, 0] /dev/md0 queried offline
# Authoritative device is [dev 22, 1] /dev/hdc1
raiddev /dev/md0
         raid-level              5
         nr-raid-disks           3
         nr-spare-disks          1
         persistent-superblock   1
         chunk-size              32

         device          /dev/hdg1
         raid-disk               2
         device          /dev/hdc1
         spare-disk              0
         device          /dev/null
         failed-disk             0
         device          /dev/null
         failed-disk             1

viking:/home/bernhard# cat /etc/raidtab
raiddev /dev/md0
         raid-level      5
         nr-raid-disks   3
         nr-spare-disks  0
         persistent-superblock   1
         parity-algorithm        left-symmetric

         device  /dev/hdc1
         raid-disk 0
         device  /dev/hde1
         failed-disk 1
         device  /dev/hdg1
         raid-disk 2
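
In case an mdadm-based config is easier to reason about, I guess the
equivalent entry in /etc/mdadm/mdadm.conf (the Debian location, I
believe) would look roughly like this -- the UUID is the one lsraid
reports above, just written with colons; untested guess:

DEVICE /dev/hdc1 /dev/hde1 /dev/hdg1
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=829542b9:3737417c:d102fd21:18ffe273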


viking:/home/bernhard# mkraid --really-force /dev/md0
DESTROYING the contents of /dev/md0 in 5 seconds, Ctrl-C if unsure!
handling MD device /dev/md0
analyzing super-block
disk 0: /dev/hdc1, 195358401kB, raid superblock at 195358336kB
disk 1: /dev/hde1, failed
disk 2: /dev/hdg1, 195358401kB, raid superblock at 195358336kB
/dev/md0: Invalid argument

viking:/home/bernhard# raidstart /dev/md0
/dev/md0: Invalid argument
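
I wonder whether assembling with mdadm instead of raidstart would
behave differently; from what I understand --force will try to bring
the array up degraded even if the superblocks disagree, so I guess the
thing to try would be (leaving out hde1, which I believe is the bad
disk):

mdadm --assemble --force /dev/md0 /dev/hdc1 /dev/hdg1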


viking:/home/bernhard# cat /proc/mdstat
Personalities : [raid1] [raid5]
md0 : inactive hdg1[2] hdc1[0]
       390716672 blocks
unused devices: <none>
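
Since md0 shows up as inactive but still has hdc1 and hdg1 bound to
it, I suppose it has to be stopped before any new assemble attempt,
presumably with one of these (untested guess):

raidstop /dev/md0
mdadm --stop /dev/md0
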
viking:/home/bernhard# pvscan -v
     Wiping cache of LVM-capable devices
     Wiping internal cache
     Walking through all physical volumes
   Incorrect metadata area header checksum
   Found duplicate PV uywoDlobnH0pbnr09dYuUWqB3A5kkh8M: using /dev/hdg1 
not /dev/hdc1
   Incorrect metadata area header checksum
   Incorrect metadata area header checksum
   Incorrect metadata area header checksum
   Found duplicate PV uywoDlobnH0pbnr09dYuUWqB3A5kkh8M: using /dev/hdg1 
not /dev/hdc1
   PV /dev/hdc1   VG data_vg   lvm2 [372,61 GB / 1,61 GB free]
   PV /dev/hda1                lvm2 [4,01 GB]
   Total: 2 [376,63 GB] / in use: 1 [372,61 GB] / in no VG: 1 [4,01 GB]
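
I guess the "duplicate PV" warnings appear because LVM is scanning the
raw raid members (hdc1/hdg1) directly instead of the assembled
/dev/md0. If I read the lvm.conf man page correctly, a filter along
these lines in /etc/lvm/lvm.conf should restrict scanning to the md
device once the array is back (untested guess):

filter = [ "a|^/dev/md0$|", "r|.*|" ]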

viking:/home/bernhard# lvscan -v
     Finding all logical volumes
   Incorrect metadata area header checksum
   Found duplicate PV uywoDlobnH0pbnr09dYuUWqB3A5kkh8M: using /dev/hdg1 
not /dev/hdc1
   ACTIVE            '/dev/data_vg/movies_lv' [200,00 GB] inherit
   ACTIVE            '/dev/data_vg/music_lv' [80,00 GB] inherit
   ACTIVE            '/dev/data_vg/backup_lv' [50,00 GB] inherit
   ACTIVE            '/dev/data_vg/ftp_lv' [40,00 GB] inherit
   ACTIVE            '/dev/data_vg/www_lv' [1,00 GB] inherit
viking:/home/bernhard# mount /dev/mapper/data_vg-ftp_lv /tmp


Jul  9 15:54:36 localhost kernel: md: bind<hdc1>
Jul  9 15:54:36 localhost kernel: md: bind<hdg1>
Jul  9 15:54:36 localhost kernel: raid5: device hdg1 operational as raid 
disk 2
Jul  9 15:54:36 localhost kernel: raid5: device hdc1 operational as raid 
disk 0
Jul  9 15:54:36 localhost kernel: RAID5 conf printout:
Jul  9 15:54:36 localhost kernel:  --- rd:3 wd:2 fd:1
Jul  9 15:54:36 localhost kernel:  disk 0, o:1, dev:hdc1
Jul  9 15:54:36 localhost kernel:  disk 2, o:1, dev:hdg1
Jul  9 15:54:53 localhost kernel: md: raidstart(pid 1950) used 
deprecated START_ARRAY ioctl. This will not be supported beyond 2.6
Jul  9 15:54:53 localhost kernel: md: could not import hdc1!
Jul  9 15:54:53 localhost kernel: md: autostart unknown-block(0,5633) 
failed!
Jul  9 15:54:53 localhost kernel: md: raidstart(pid 1950) used 
deprecated START_ARRAY ioctl. This will not be supported beyond 2.6
Jul  9 15:54:53 localhost kernel: md: could not import hdg1, trying to 
run array nevertheless.
Jul  9 15:54:53 localhost kernel: md: could not import hdc1, trying to 
run array nevertheless.
Jul  9 15:54:53 localhost kernel: md: autorun ...
Jul  9 15:54:53 localhost kernel: md: considering hde1 ...
Jul  9 15:54:53 localhost kernel: md:  adding hde1 ...
Jul  9 15:54:53 localhost kernel: md: md0 already running, cannot run hde1
Jul  9 15:54:53 localhost kernel: md: export_rdev(hde1)
Jul  9 15:54:53 localhost kernel: md: ... autorun DONE.
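
I suppose I should also deactivate the volume group before the next
attempt, so nothing keeps the underlying disks busy; I believe that is
just:

vgchange -an data_vg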


Thread overview: 5+ messages
2004-07-09 20:16 Bernhard Dobbels [this message]
2004-07-09 21:38 ` raid5+ lvm2 disaster maarten van den Berg
     [not found] ` <1089415087.17625.200079546@webmail.messagingengine.com>
2004-07-12 22:33   ` Matthew (RAID)
2004-07-16 11:02     ` Bernhard Dobbels
2004-07-16 13:27 ` Luca Berra
