* Please help- raid1 recovery after disk failure
@ 2004-10-18 22:56 Konstantin Olchanski
2004-10-18 23:22 ` Mike Tran
2004-10-18 23:28 ` Guy
0 siblings, 2 replies; 5+ messages in thread
From: Konstantin Olchanski @ 2004-10-18 22:56 UTC (permalink / raw)
To: linux-raid
Dear Linux raiders- I ran into a problem with raid1 recovery after
a disk failure (running Fedora2, kernel 2.6.8-1.521smp).
1) I had a raid1 filesystem mirrored across /dev/hda2 and /dev/hdc2.
2) Disk hda died (unreadable sectors, failing SMART tests).
3) A new blank hda was installed and partitioned exactly like hdc.
4) I cannot restart and rebuild the raid1 volume because hdc2 is
in a funny "spare" state (see below).
How do I mark hdc2 as "active"?
Once it is "active", I assume I will then be able to restart md0 and
hot-add /dev/hda2 as usual. (And the mirror will resync and rebuild
itself? Hopefully?)
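(Concretely, assuming md0 can be assembled degraded from hdc2 alone,
I expect the sequence to look something like:

  mdadm --assemble --run /dev/md0 /dev/hdc2   # start the degraded mirror from the good half
  mdadm /dev/md0 -a /dev/hda2                 # hot-add the new partition; resync should start

but right now the "spare" state gets in the way.)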
[root@tw04 root]# mdadm -E /dev/hdc2
/dev/hdc2:
Magic : a92b4efc
Version : 00.90.00
UUID : aade8782:20122089:4f496788:228d85b9
Creation Time : Fri Oct 8 17:12:56 2004
Raid Level : raid1
Device Size : 124158208 (118.41 GiB 127.14 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Mon Oct 18 06:04:35 2004
State : clean, no-errors
Active Devices : 1
Working Devices : 2
Failed Devices : 1
Spare Devices : 1
Checksum : 7041db42 - correct
Events : 0.312429
Number Major Minor RaidDevice State
this 2 22 2 2 spare /dev/hdc2
0 0 3 2 0 active sync /dev/hda2
1 1 0 0 1 faulty removed
2 2 22 2 2 spare /dev/hdc2
[root@tw04 root]#
--
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
* Re: Please help- raid1 recovery after disk failure
2004-10-18 22:56 Please help- raid1 recovery after disk failure Konstantin Olchanski
@ 2004-10-18 23:22 ` Mike Tran
2004-10-27 4:20 ` Konstantin Olchanski
2004-10-18 23:28 ` Guy
1 sibling, 1 reply; 5+ messages in thread
From: Mike Tran @ 2004-10-18 23:22 UTC (permalink / raw)
To: linux-raid
I would re-create the md0 array with a missing disk, as follows:
mdadm -C /dev/md0 -l 1 -n 2 /dev/hdc2 missing
Later you can hot add a disk to make it a normal 2-way mirror array.
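The hot-add step would be something like (assuming the replacement
comes back as hda):

  mdadm /dev/md0 -a /dev/hda2   # resync onto the new disk starts automatically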
--
Regards,
Mike T.
* RE: Please help- raid1 recovery after disk failure
2004-10-18 22:56 Please help- raid1 recovery after disk failure Konstantin Olchanski
2004-10-18 23:22 ` Mike Tran
@ 2004-10-18 23:28 ` Guy
2004-10-18 23:31 ` Guy
1 sibling, 1 reply; 5+ messages in thread
From: Guy @ 2004-10-18 23:28 UTC (permalink / raw)
To: 'Konstantin Olchanski', linux-raid
You must remove the bad disk first:
mdadm -r /dev/md0 /dev/hda2
Then add the new disk:
mdadm -a /dev/md0 /dev/hda2
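You can watch the rebuild afterwards with, e.g.:

  cat /proc/mdstat   # shows [U_] while resyncing, [UU] once the mirror is whole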
Why are you using the obsolete raidtools, which includes hot-add?
Is Red Hat still using the old stuff?
Guy
* RE: Please help- raid1 recovery after disk failure
2004-10-18 23:28 ` Guy
@ 2004-10-18 23:31 ` Guy
0 siblings, 0 replies; 5+ messages in thread
From: Guy @ 2004-10-18 23:31 UTC (permalink / raw)
To: 'Guy', 'Konstantin Olchanski', linux-raid
Oops! hdc2!
Hopefully you know which disk was replaced! :)
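If in doubt, mdadm -E on each partition shows which slot and state the
device itself claims, e.g.:

  mdadm -E /dev/hda2   # a freshly partitioned disk has no md superblock, which itself marks it as the replacement
  mdadm -E /dev/hdc2   # the "this" line shows the slot/state this device claims for itself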
* Re: Please help- raid1 recovery after disk failure
2004-10-18 23:22 ` Mike Tran
@ 2004-10-27 4:20 ` Konstantin Olchanski
0 siblings, 0 replies; 5+ messages in thread
From: Konstantin Olchanski @ 2004-10-27 4:20 UTC (permalink / raw)
To: Mike Tran; +Cc: linux-raid
On Mon, Oct 18, 2004 at 06:22:05PM -0500, Mike Tran wrote:
> I would re-create md0 array with a missing disk as follows:
> mdadm -C /dev/md0 -l 1 -n 2 /dev/hdc2 missing
> Later you can hot add a disk to make it a normal 2-way mirror array.
Thanks for all the responses and suggestions- I was able to rebuild
my raid1 (mirror) array without losing any data. For the record,
this is what I did:
0) /dev/hdc2 (half-mirror) mounted as "/"
1) mdadm -C /dev/md0 ... /dev/hdc2 ...
did not work; it says "hdc2" is busy. Maybe it was for the better.
2) reboot into the Fedora2 rescue CD, get the rescue root shell
3) make sure hdc2 is not mounted, md0 is not active (they are not)
4) mdadm -C /dev/md0 -l1 -n2 -c256 /dev/hdc2 missing
(may have warned about something)
5) mdadm --run /dev/md0, cat /proc/mdstat, mount /dev/md0 /mnt/tmp,
edit grub.conf (root=/dev/hdc2 -> /dev/md0), edit fstab (hdc2 -> md0).
Note: /proc/mdstat shows two device slots with status [U_].
6) umount /dev/md0, mdadm --stop /dev/md0
7) reset, remove rescue CD
8) boot from the hard disk, note: md0 started, mounted as "/".
9) mdadm /dev/md0 -a /dev/hda2, note: resync starts automatically
10) wait for the resync to complete; 160 GBytes took about 90 minutes
11) /proc/mdstat shows status [UU].
12) shutdown, move loose disks into enclosures, close the box
13) boot: md0 comes up, status [UU], I am in business until the next
spurious read error (I am too lazy to roll a custom patched kernel,
I would rather wait until Red Hat applies the raid1 patches fixing
bug 136485).
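For the archives, the recipe condenses to roughly this (device names
as in my setup):

  # from the rescue CD, with nothing mounted and md0 stopped:
  mdadm -C /dev/md0 -l1 -n2 -c256 /dev/hdc2 missing   # re-create degraded; data on hdc2 survives
  mount /dev/md0 /mnt/tmp       # then point grub.conf and fstab at md0 instead of hdc2
  # reboot from the hard disk, then:
  mdadm /dev/md0 -a /dev/hda2   # hot-add the new disk; resync starts by itself
  cat /proc/mdstat              # wait for [UU]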
K.O.
--
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada