* Re-assembling a software RAID in which device names have changed.
From: Sean H. @ 2008-04-08 4:34 UTC
To: linux-raid
To preface, I'm a fairly new Linux user, and have little experience
with RAID. I've taken this question to several places and have yet to
get an answer which solves my problem, so I figured I'd come to the
experts.
I have a five-disk RAID 5, and one of the disks is failing. Every few
days it'll start buzzing, and will continue buzzing until the drive is
forced to spin down and back up, either by a restart or by suspending
to RAM.
Now, today, I wanted to determine which disk was failing, so I
unmounted my array and unplugged drives - specifically, three of them.
The third was the culprit, and I plugged the drives back in and rebooted.
The gift, or in this case curse, of my motherboard is that it
supports hot-swapping of SATA drives. So the drives didn't just
disappear inside the OS and reappear after a reboot. They disappeared
and re-appeared in the OS with incorrect /dev/sd* locations, and then
I rebooted.
I was unable to boot until I removed the line detailing the array
from my fstab. Now, when I manually run 'mdadm --assemble /dev/md0',
mdadm finds the two untouched drives, then stops and tells me that
they're not enough to start the array.
So... How can I reassemble the array without knowing what order the
drives are in?
* Re: Re-assembling a software RAID in which device names have changed.
From: Michael Tokarev @ 2008-04-08 17:25 UTC
To: Sean H.; +Cc: linux-raid
Sean H. wrote:
> []
> So... How can I reassemble the array without knowing what order the
> drives are in?
Don't list individual component devices in mdadm.conf.
Use array UUIDs instead.
I.e., instead of:
--- WRONG ---
ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1,/dev/sdc1
--- WRONG ---
use this:
--- RIGHT ---
ARRAY /dev/md1 UUID=11111111:22222222:33333333:44444444
--- RIGHT ---
Or use modern ways to assemble the arrays - such as homehost
(requires version 1 superblock).
In either case mdadm will find the relevant components (assuming
DEVICE line in mdadm.conf is correct) and figure out the right
order.
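If you don't know the array's UUID offhand, mdadm can print ready-made
ARRAY lines from the superblocks it finds. A sketch (the exact output
format varies between mdadm versions):
  mdadm --examine --scan
  ARRAY /dev/md1 level=raid5 num-devices=5 UUID=11111111:22222222:33333333:44444444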
/mjt
* Re: Re-assembling a software RAID in which device names have changed.
From: Sean H. @ 2008-04-08 17:55 UTC
To: linux-raid, Michael Tokarev
On Tue, Apr 8, 2008 at 1:25 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> []
> Don't list individual component devices in mdadm.conf.
> Use array UUIDs instead.
>
> I.e., instead of:
>
> --- WRONG ---
> ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1,/dev/sdc1
> --- WRONG ---
>
> use this:
>
> --- RIGHT ---
> ARRAY /dev/md1 UUID=11111111:22222222:33333333:44444444
> --- RIGHT ---
>
> Or use modern ways to assemble the arrays - such as homehost
> (requires version 1 superblock).
>
> In either case mdadm will find the relevant components (assuming
> DEVICE line in mdadm.conf is correct) and figure out the right
> order.
>
> /mjt
>
My mdadm.conf is configured with UUIDs:
# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR root
ARRAY /dev/md0 level=raid5 num-devices=5
uuid=58dcdaf3:bdf3f176:f2dd1b6b:f095c127
Tried the following: 'mdadm --assemble /dev/md0 --uuid
58dcdaf3:bdf3f176:f2dd1b6b:f095c127'
... and got this: mdadm: /dev/md0 assembled from 2 drives - not enough
to start the array.
(Which is what I've been getting for a while, now.)
It's possible to correct this issue by unplugging the three drives and
plugging them back in and rebooting, so the drives get their original
/dev/sd* locations, is it not? (Even if it is possible, I'd like to
learn how to fix problems like this at the software level rather than
the hardware level.)
* Re: Re-assembling a software RAID in which device names have changed.
From: Michael Tokarev @ 2008-04-08 18:37 UTC
To: Sean H.; +Cc: linux-raid
[Please respect the Reply-To header]
Sean H. wrote:
[]
> My mdadm.conf is configured with UUIDs:
Ok.
> DEVICE partitions
Ok.
> ARRAY /dev/md0 level=raid5 num-devices=5
> uuid=58dcdaf3:bdf3f176:f2dd1b6b:f095c127
> Tried the following: 'mdadm --assemble /dev/md0 --uuid
> 58dcdaf3:bdf3f176:f2dd1b6b:f095c127'
> ... and got this: mdadm: /dev/md0 assembled from 2 drives - not enough
> to start the array.
> (Which is what I've been getting for a while, now.)
Ok. So it's a different problem you have. What's the
reason you think it's due to re-numbering/naming of the
disks?
When you unplugged 3 of your disks, I suspect Linux noticed
that fact and the md layer marked them as "failed" in the
array, leaving the 2 still present. Now, when you have all 5
of them again, 2 of them (the ones which were left in
the system) are "fresh", and 3 (the ones which were
removed) are "old". So you really don't have enough
fresh drives to start the array.
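One way to check this guess is to compare the per-component event
counters. A sketch (substitute your real component devices):
  mdadm --examine /dev/sda1 | grep -E 'Update Time|Events'
  mdadm --examine /dev/sdb1 | grep -E 'Update Time|Events'
The members that were unplugged should show older update times and
lower event counts than the ones that stayed in.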
Now take a look at the verbose output of mdadm (see the -v option).
If my guess is right, use the --force option. And take a look
at the Fine Manual, after all -- at the section describing
assemble mode.
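Something along these lines, as a sketch only (adjust to your own
config and UUID):
  mdadm --assemble /dev/md0 --uuid 58dcdaf3:bdf3f176:f2dd1b6b:f095c127 -v --force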
> It's possible to correct this issue by unplugging the three drives and
> plugging them back in and rebooting, so the drives get their original
> /dev/sd* locations, is it not? (Even if it is possible, I'd like to
> learn how to fix problems like this at the software level rather than
> the hardware level.)
Please answer this question. Why do you think that the array
does not start because of disk renumbering?
/mjt
* Re: Re-assembling a software RAID in which device names have changed.
From: Sean H. @ 2008-04-08 19:06 UTC
To: mjt; +Cc: linux-raid
On Tue, Apr 8, 2008 at 2:37 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> [Please respect the Reply-To header]
>
> Sean H. wrote:
> []
> > Tried the following: 'mdadm --assemble /dev/md0 --uuid
> > 58dcdaf3:bdf3f176:f2dd1b6b:f095c127'
> > ... and got this: mdadm: /dev/md0 assembled from 2 drives - not enough
> > to start the array.
>
> Ok. So it's a different problem you have. What's the
> reason you think it's due to re-numbering/naming of the
> disks?
>
> When you unplugged 3 of your disks, I suspect Linux noticed
> that fact and the md layer marked them as "failed" in the
> array, leaving the 2 still present. Now, when you have all 5
> of them again, 2 of them (the ones which were left in
> the system) are "fresh", and 3 (the ones which were
> removed) are "old". So you really don't have enough
> fresh drives to start the array.
>
> Now take a look at the verbose output of mdadm (see the -v option).
> If my guess is right, use the --force option. And take a look
> at the Fine Manual, after all -- at the section describing
> assemble mode.
> []
>
> Please answer this question. Why do you think that the array
> does not start because of disk renumbering?
>
> /mjt
Apologies. As I said, I'm new to mdadm / RAID.
I forced it to assemble, and it did. I was then able to mount the
array manually. /dev/sdf happens to be my OS drive, so for the
purposes of mdadm it's irrelevant. It appears you were correct in that
the drives were marked faulty - but the fact that the array was
started with four devices is troublesome, because I lose any safety
gained from RAID 5.
Below is the first command, and, separated by ten hyphens, the output
of --detail /dev/md0, which shows that the remaining device is marked
"removed" and not "failed".
I thank you for your help thus far - you've allowed me to mount the
array and access my data. However, I would very much like to get my
RAID 5 back to non-degraded status ASAP.
[root@localhost ~]# mdadm --assemble /dev/md0 -v --force
mdadm: looking for devices for /dev/md0
mdadm: cannot open device /dev/sdf3: Device or resource busy
mdadm: /dev/sdf3 has wrong uuid.
mdadm: cannot open device /dev/sdf2: Device or resource busy
mdadm: /dev/sdf2 has wrong uuid.
mdadm: cannot open device /dev/sdf1: Device or resource busy
mdadm: /dev/sdf1 has wrong uuid.
mdadm: cannot open device /dev/sdf: Device or resource busy
mdadm: /dev/sdf has wrong uuid.
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 0.
mdadm: forcing event count in /dev/sdd1(3) from 2588 upto 2596
mdadm: forcing event count in /dev/sdb1(1) from 2559 upto 2596
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/sdb1
mdadm: clearing FAULTY flag for device 0 in /dev/md0 for /dev/sdd1
mdadm: added /dev/sdb1 to /dev/md0 as 1
mdadm: added /dev/sdc1 to /dev/md0 as 2
mdadm: added /dev/sdd1 to /dev/md0 as 3
mdadm: no uptodate device for slot 4 of /dev/md0
mdadm: added /dev/sda1 to /dev/md0 as 0
mdadm: /dev/md0 has been started with 4 drives (out of 5).
----------
[root@localhost ~]# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Wed Mar 5 15:55:52 2008
Raid Level : raid5
Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Tue Apr 8 14:58:02 2008
State : clean, degraded
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 58dcdaf3:bdf3f176:f2dd1b6b:f095c127
Events : 0.2612
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 49 3 active sync /dev/sdd1
4 0 0 4 removed
* Re: Re-assembling a software RAID in which device names have changed.
From: Michal Soltys @ 2008-04-08 20:19 UTC
To: Sean H.; +Cc: linux-raid
Sean H. wrote:
>
> Now, today, I wanted to determine which disk was failing, so I
> unmounted my array and unplugged drives - specifically, three of them.
> The third was the culprit, and I plugged the drives back in and rebooted.
>
Just to be sure - did you do that on a live array? (It kinda looks
like you did.)
>
> Below is the first command, and, separated by ten hyphens, the output
> of --detail /dev/md0, which shows that the remaining device is marked
> "removed" and not "failed".
>
> I thank you for your help thus far - you've allowed me to mount the
> array and access my data. However, I would very much like to get my
> RAID 5 back to non-degraded status ASAP.
>
You can add the drive/partition back to the existing array with the
--add option. It will then start rebuilding, which can take a while.
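As a sketch (here /dev/sde1 is only a guess at the name of the missing
fifth member - check with 'mdadm --examine' or /proc/partitions first):
  mdadm /dev/md0 --add /dev/sde1
  cat /proc/mdstat      # watch the rebuild progress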