linux-raid.vger.kernel.org archive mirror
* Can't work out how to recover after multiple failures
From: J. Ali Harlow @ 2005-07-24 16:22 UTC
  To: linux-raid

Can anyone please help me get my RAID 1 device back up and running? I
had a chipset failure which took out one disk, and then, just after
replacing the disk and resyncing, the original disk lost power due to a
loose connection. mdadm seems to think that both devices are fine, but I
can't find the magic voodoo to get the array working again. Sorry if
this is a really stupid question. I did read the mdadm documentation
pretty carefully, but I'm obviously missing something. Many thanks,

Ali.

# dmesg | fgrep md
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: raid1 personality registered as nr 3
md: Autodetecting RAID arrays.
md: autorun ...
md: considering hde2 ...
md:  adding hde2 ...
md:  adding hda2 ...
md: created md0
md: bind<hda2>
md: bind<hde2>
md: running: <hde2><hda2>
md: kicking non-fresh hda2 from array!
md: unbind<hda2>
md: export_rdev(hda2)
raid1: no operational mirrors for md0
md: pers->run() failed ...
md: do_md_run() returned -22
md: md0 stopped.
md: unbind<hde2>
md: export_rdev(hde2)
md: ... autorun DONE.
md: md0 stopped.
md: bind<hda2>
md: bind<hde2>
md: kicking non-fresh hda2 from array!
md: unbind<hda2>
md: export_rdev(hda2)
raid1: no operational mirrors for md0
md: pers->run() failed ...
# mdadm --assemble /dev/md0 /dev/hda2 /dev/hde2
# more /proc/mdstat
Personalities : [raid1]
md0 : inactive hde2[2]
      23535104 blocks
unused devices: <none>
# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat May  7 00:03:45 2005
     Raid Level : raid1
    Device Size : 23535104 (22.44 GiB 24.10 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Jul 23 08:25:16 2005
          State : dirty, degraded
 Active Devices : 0
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 2


    Number   Major   Minor   RaidDevice State
       0       3        2        0      spare   /dev/hda2
       1       0        0       -1      removed
       2      33        2       -1      spare   /dev/hde2
# mdadm -E /dev/hda2
/dev/hda2:
          Magic : a92b4efc
        Version : 00.90.01
           UUID : aeff1135:511fe13c:e598e943:0907531e
  Creation Time : Sat May  7 00:03:45 2005
     Raid Level : raid1
    Device Size : 23535104 (22.44 GiB 24.10 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Sat Jul 23 08:24:15 2005
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 1ebddac8 - correct
         Events : 0.469741


      Number   Major   Minor   RaidDevice State
this     0       3        2        0      active sync   /dev/hda2
   0     0       3        2        0      active sync   /dev/hda2
   1     1       0        0        1      faulty removed
   2     2      33        2        1      spare   /dev/hde2
# mdadm -E /dev/hde2
/dev/hde2:
          Magic : a92b4efc
        Version : 00.90.01
           UUID : aeff1135:511fe13c:e598e943:0907531e
  Creation Time : Sat May  7 00:03:45 2005
     Raid Level : raid1
    Device Size : 23535104 (22.44 GiB 24.10 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Sat Jul 23 08:25:16 2005
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 1ec3a142 - correct
         Events : 0.658941


      Number   Major   Minor   RaidDevice State
this     2      33        2        2      spare   /dev/hde2
   0     0       3        2        0      active sync   /dev/hda2
   1     1       0        0        1      faulty removed
   2     2      33        2        2      spare   /dev/hde2
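
One difference that stands out between the two superblocks is the event
counter: 0.469741 on hda2 against 0.658941 on hde2, which I assume is
why the kernel kicks hda2 as "non-fresh" when assembling. A quick way to
compare the counters directly (just grepping the examine output above)
is:

# mdadm -E /dev/hda2 | fgrep Events
         Events : 0.469741
# mdadm -E /dev/hde2 | fgrep Events
         Events : 0.658941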

-- 
http://www.fastmail.fm - Choose from over 50 domains or use your own



* Re: Can't work out how to recover after multiple failures
From: J. Ali Harlow @ 2005-07-26  8:51 UTC
  To: linux-raid


On Sun, 24 Jul 2005 17:22:21 +0100, "J. Ali Harlow" <ali@juiblex.co.uk>
said:
> Can anyone please help me get my RAID 1 device back up and running? I
> had a chipset failure which took out one disk, and then, just after
> replacing the disk and resyncing, the original disk lost power due to a
> loose connection. mdadm seems to think that both devices are fine, but I
> can't find the magic voodoo to get the array working again. Sorry if
> this is a really stupid question. I did read the mdadm documentation
> pretty carefully, but I'm obviously missing something. Many thanks,

Finally got this working. There seem to have been two issues. First,
mdadm didn't like running the array as a separate step from assembling
it (mdadm -R /dev/md0 was giving me "Invalid argument"); doing both in
one step, however, did work. Second, I suspect the two devices weren't
in sync, even though mdadm -E wasn't showing this, which was causing
attempts to assemble an array containing both devices to fail quietly.
The final sequence which worked was:

reboot
mdadm --assemble /dev/md0 --run /dev/hda2
mdadm /dev/md0 -a /dev/hde2
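
Adding hde2 back should then kick off a resync onto it, and the rebuild
progress can be watched with the following (the exact output format
varies a little by kernel version):

# cat /proc/mdstat
# mdadm -D /dev/md0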

Easy when you know :-)

Ali.

-- 
http://www.fastmail.fm - Same, same, but different…


